Below are extracts of the comments that I sent to Veritas today on this error.
Has anyone got any thoughts other than restarting the services?
I hope that a restart of the service has solved your problem, LidoDeJesolo. In my case it did not.
I have found this error to affect only one server. It started off at once or twice every other month; now it's 2 - 4 times a week. I am moving the share points and altering the login scripts accordingly tomorrow.
It's causing a lot of disruption, with the client box crashing when a backup is taken from it. It freezes and I can't do anything with it!
Any thoughts?
p.s. I have read the other threads in the forums
P.
Hi,
I am receiving the following error in Backup Exec for Windows Servers Version 9.00 Rev. 4454
************************************************************************************************
Error category : Resource Errors
Error code : a00084f8 HEX
Error description : A timeout occurred waiting for data from the agent during operation shutdown.
************************************************************************************************
The backup job is scheduled for 0500 in the morning. I have moved it to the last position in a series of 16 jobs. It used to be scheduled for 2130, but it once killed the following eight jobs by crashing the Backup Exec engine on the server. I can't afford for that to happen, so I moved it to the end of the queue.
This is a usual output from the job :-
Backed up 14410 files in 1569 directories.
Processed 3,427,371,664 bytes in 10 minutes and 38 seconds.
Throughput rate: 307 MB/min
More frequently, though, I have been receiving the error stated above along with the following :-
Backed up 6733 files in 1147 directories.
Processed 1,474,702,320 bytes in 37 minutes and 7 seconds.
Throughput rate: 37.9 MB/min
As you can see, the rate drops (roughly 8x, from 307 to 37.9 MB/min) and it is not a full backup (it only gets about halfway through).
Looking at the performance logging that I did on the server being backed up, the problem occurred at 05:06: the last entries for Processor Time and Network Load are at 02/18/2004 05:06:01. The server being backed up actually *FREEZES* at this point - THE ONLY WAY TO RESTART IS TO POWER DOWN.
I have enabled debug mode on both the client and the server. The client end of things dies at a certain time into the job, usually after about 6 mins or so (1.5 GB data transmitted) - as stated, THE ONLY WAY TO RESTART IS TO POWER DOWN.
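For anyone who wants to check the comparison, here is a minimal sketch that recomputes the two throughput figures from the byte counts and durations in the job summaries. It assumes that "MB" in the job log means 2**20 bytes; that assumption reproduces the reported 307 and 37.9 MB/min, but it is not confirmed anywhere in the post.

```python
# Figures copied from the two job summaries in the post.
MIB = 1024 * 1024  # assumption: "MB" in the job log means 2**20 bytes


def throughput_mb_per_min(total_bytes: int, minutes: int, seconds: int) -> float:
    """Return the average transfer rate in MB/min over the whole job."""
    elapsed_min = minutes + seconds / 60
    return total_bytes / MIB / elapsed_min


GOOD_BYTES = 3_427_371_664  # successful job, 10 min 38 s
BAD_BYTES = 1_474_702_320   # failed job, 37 min 7 s

good = throughput_mb_per_min(GOOD_BYTES, 10, 38)
bad = throughput_mb_per_min(BAD_BYTES, 37, 7)

print(f"successful job: {good:.1f} MB/min")
print(f"failed job:     {bad:.1f} MB/min")
print(f"slowdown:       {good / bad:.1f}x")
print(f"data completed: {BAD_BYTES / GOOD_BYTES:.0%}")
```

On these numbers the slowdown works out to about 8x, and the failed job moved a little under half of the data. Note that the failed job's rate is an average over the full 37 minutes, including any time spent waiting after the client stopped responding.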
Client Log
**********
successful job
--------------
a0c 2/17/2004 12:08:28: Allocated 10 buffers, size 32768 bytes, total used: 328520
a0c 2/17/2004 12:08:28: TF_OpenSet()
a0c 2/17/2004 12:08:28: SetupFormatEnv( fmt=0 )
a0c 2/17/2004 12:08:28: End od TF_OpenSet() ret_val = 0, num buffers = 10
a0c 2/17/2004 12:08:28: Informational: Local share path F: used to populate the System Protected File table
a0c 2/17/2004 12:19:02: TF xfer time = 633 seconds.
a0c 2/17/2004 12:19:02: WRITE: tpreceive_fail_count = 31554
a0c 2/17/2004 12:19:02: WRITE: waiting_on_buffers_count = 31545
a0c 2/17/2004 12:19:02: WRITE: buffers_written_count = 105046
a0c 2/17/2004 12:19:02: TF_CloseSet()
a0c 2/17/2004 12:19:02: FreeFormatEnv( cur_fmt=0 )
a0c 2/17/2004 12:19:02: Detach from \\server1\F$
a0c 2/17/2004 12:19:02: TF_FreeDriveContext( 2FDD48 )
a0c 2/17/2004 12:19:02: TF_FreeTapeBuffers: from 10 to 0 buffers
a0c 2/17/2004 12:19:02: Job Stop(0) - Tue Feb 17 12:19:02 2004
failed job
----------
a2c 2/18/2004 5:01:30: Allocated 10 buffers, size 32768 bytes, total used: 328520
a2c 2/18/2004 5:01:30: TF_OpenSet()
a2c 2/18/2004 5:01:30: SetupFormatEnv( fmt=0 )
a2c 2/18/2004 5:01:30: End od TF_OpenSet() ret_val = 0, num buffers = 10
a2c 2/18/2004 5:01:30: Informational: Local share path F: used to populate the System Protected File table
^^^ last entry
From the Server logs I obtain the following
Server Log
**********
successful job
--------------
9c8 17/02/2004 12:08:28: OpenListenSocket: Media server IP address: 678ce186
9c8 17/02/2004 12:08:28: OpenListenSocket: Media server port: 6507
9c8 17/02/2004 12:08:28:
dataStartBackup: ndmpSendRequest returned: 0x0, 0
9c8 17/02/2004 12:19:02: TF_NDMPGetResult(): MediaServer thread done, returning TFLE 0
9c8 17/02/2004 12:19:02: NDMPEngine::MessagePumpAndWaitForResults(): TF_NDMPGetResult() returned 0
9c8 17/02/2004 12:19:03: data halted: SUCCESSFUL
9c8 17/02/2004 12:19:03: NDMPEngine: Shutting down.
9c8 17/02/2004 12:19:05: WriteEndSet( 1 ) returning 0
9c8 17/02/2004 12:19:07: WriteEndSet( 1 ) returning 0
9c8 17/02/2004 12:19:07: WriteEndSet( 0 ) returning 0
9c8 17/02/2004 12:19:07: HARDWARE COMPRESSION ===> Setting compression off.
9c8 17/02/2004 12:19:08: TF_CloseSet
9c8 17/02/2004 12:19:45: RewindDrive mover ret = 0 (0x0)
9c8 17/02/2004 12:19:45: ret_val = 0
9c8 17/02/2004 12:19:45: TAPEALERT: Get TapeAlert Flags Return Code = 0X0
9c8 17/02/2004 12:19:45: TAPEALERT: TapeAlert Device Flag = 0X0
9c8 17/02/2004 12:19:45: TAPEALERT: TapeAlert Changer Flag = 0X0
9c8 17/02/2004 12:19:45: TF_FreeDriveContext( 1D74FC0 )
9c8 17/02/2004 12:19:45: TF_FreeTapeBuffers: from 2 to 0 buffers
failed job
----------
60c 18/02/2004 05:01:30: OpenListenSocket: Media server IP address: 678ce186
60c 18/02/2004 05:01:30: OpenListenSocket: Media server port: 2a0f
60c 18/02/2004 05:01:30:
dataStartBackup: ndmpSendRequest returned: 0x0, 0
60c 18/02/2004 05:06:26: ERROR: ndmpcSendRequest->connection error
60c 18/02/2004 05:06:26: ERROR: ndmpSendRequest failed:
60c 18/02/2004 05:06:26: NDMPEngine: NDMP control connection lost.
540 18/02/2004 05:15:04: DeviceManager: timeout event fired
540 18/02/2004 05:15:04: DeviceManager: processing pending requests
540 18/02/2004 05:15:04: DeviceManager: going to sleep for 900000 msecs
540 18/02/2004 05:30:04: DeviceManager: timeout event fired
540 18/02/2004 05:30:04: DeviceManager: processing pending requests
540 18/02/2004 05:30:04: DeviceManager: going to sleep for 900000 msecs
60c 18/02/2004 05:36:26: NDMPEngine::MessagePumpAndWaitForResults(): TF_NDMPGetResult() timer elapsed!
60c 18/02/2004 05:36:26: ERROR: ndmpcSendRequest->connection error
60c 18/02/2004 05:36:26: ERROR: ndmpSendRequest failed:
60c 18/02/2004 05:38:28: WriteEndSet( 1 ) returning 0
60c 18/02/2004 05:38:30: WriteEndSet( 1 ) returning 0
60c 18/02/2004 05:38:30: WriteEndSet( 0 ) returning 0
60c 18/02/2004 05:38:30: HARDWARE COMPRESSION ===> Setting compression off.
60c 18/02/2004 05:38:37: TF_CloseSet
60c 18/02/2004 05:39:04: RewindDrive mover ret = 0 (0x0)
60c 18/02/2004 05:39:04: ret_val = 0
60c 18/02/2004 05:39:04: TAPEALERT: Get TapeAlert Flags Return Code = 0X0
60c 18/02/2004 05:39:04: TAPEALERT: TapeAlert Device Flag = 0X0
60c 18/02/2004 05:39:04: TAPEALERT: TapeAlert Changer Flag = 0X0
60c 18/02/2004 05:39:04: TF_FreeDriveContext( 1D74FC0 )
60c 18/02/2004 05:39:04: TF_FreeTapeBuffers: from 2 to 0 buffers
60c 18/02/2004 05:39:04: FreeFormatEnv( cur_fmt=0 )
540 18/02/2004 05:45:04: DeviceManager: timeout event fired
540 18/02/2004 05:45:04: DeviceManager: processing pending requests
540 18/02/2004 05:45:04: DeviceManager: going to sleep for 900000 msecs
540 18/02/2004 06:00:04: DeviceManager: timeout event fired
540 18/02/2004 06:00:04: DeviceManager: processing pending requests
540 18/02/2004 06:00:04: DeviceManager: going to sleep for 900000 msecs
540 18/02/2004 06:15:04: DeviceManager: timeout event fired
--------------------------------------------------------------------------------------------------------
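To make the failed job's timeline easier to read, the sketch below computes the gaps between the key entries in the server log. The timestamps are copied from the log; the event labels are my own shorthand, and which log lines count as "key" is a judgment call.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M:%S"  # the server log uses day/month/year

# Timestamps copied from the failed-job server log; labels are shorthand.
events = [
    ("backup started (dataStartBackup)", "18/02/2004 05:01:30"),
    ("NDMP control connection lost", "18/02/2004 05:06:26"),
    ("TF_NDMPGetResult() timer elapsed", "18/02/2004 05:36:26"),
    ("TF_CloseSet", "18/02/2004 05:38:37"),
]


def gaps(evts):
    """Return (label, seconds since the previous event) for each event after the first."""
    times = [datetime.strptime(ts, FMT) for _, ts in evts]
    return [
        (evts[i][0], int((times[i] - times[i - 1]).total_seconds()))
        for i in range(1, len(evts))
    ]


for label, secs in gaps(events):
    print(f"{secs:5d} s  until {label}")
```

The gaps come out as 296 s, 1800 s and 131 s, which together make up the 37 minutes and 7 seconds reported for the job. The 1800 s wait is the same number as the "Data Connection Flush Timeout Seconds" value set in the registry; that may be the timer involved, but the log does not say so. If all of the data was moved in the first 296 s, the rate before the freeze would have been close to normal, which suggests the low reported throughput is mostly waiting time rather than a slow transfer.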
Job Log (Failed)
- <joblog>
<job_log_version version="1.0" />
- <header>
<filler>======================================================================</filler>
<server>Job server: backupserver</server>
<name>Job name: 0500 server1 F$</name>
<start_time>Job started: 18 February 2004 at 05:00:04</start_time>
<type>Job type: Backup</type>
<log_name>Job Log: BEX02716.xml</log_name>
<filler>======================================================================</filler>
</header>
- <media_drive_and_media_info>
Drive and media information from media mount:
<drive_name>Drive Name: HP DAILY 80</drive_name>
<media_label>Media Label: W032_Tuesday</media_label>
<media_guid>Media GUID: {CB9C7E93-9BA3-46DA-ACEE-F4169180BAF4}</media_guid>
<media_overwrite_date>Overwrite Protected Until: 10/03/2004 04:09:25</media_overwrite_date>
<media_append_date>Appendable Until: 31/12/9999 00:00:00</media_append_date>
<media_set_target>Targeted Media Set Name: DLTWeekly</media_set_target>
</media_drive_and_media_info>
- <backup>
<filler>======================================================================</filler>
<title>Job Operation - Backup</title>
<append_or_overwrite>Media operation - append.</append_or_overwrite>
<compression>Hardware compression enabled.</compression>
<filler>======================================================================</filler>
<msgtitle_pre_jobstart>Starting Pre Job Command < net stop mcshield ></msgtitle_pre_jobstart>
- <set>
<set_resource_name>\\server1\F$</set_resource_name>
<tape_name>Family Name: "Media created 17/02/2004 19:30:05"</tape_name>
- <volume>
<display_volume>Backup of "\\server1\F$ "</display_volume>
</volume>
<description>Backup set #11 on storage media #1 Backup set description: "0500 server1 F$"</description>
<backup_type>Backup Type: COPY - Back Up Files</backup_type>
<start_time>Backup started on 18/02/2004 at 05:01:30.</start_time>
<info>Network control connection is established between backupserver:3879 <--> server1:10000</info>
<info>Network data connection is established between backupserver:3882 <--> server1:2516</info>
<end_time>Backup completed on 18/02/2004 at 05:38:37.</end_time>
- <summary>
<misc>Backed up 6733 files in 1147 directories.</misc>
<new_processed_bytes>Processed 1,474,702,320 bytes in 37 minutes and 7 seconds.</new_processed_bytes>
<vlm_hist_rateformat2>Throughput rate: 37.9 MB/min</vlm_hist_rateformat2>
</summary>
<filler>----------------------------------------------------------------------</filler>
</set>
</backup>
- <footer>
<filler>======================================================================</filler>
<end_time>Job ended: 18 February 2004 at 09:08:50</end_time>
<engine_completion_status>Job completion status: Failed</engine_completion_status>
<filler>======================================================================</filler>
<completeStatus>6</completeStatus>
<errorCode>Final error code: a00084f8 HEX</errorCode>
<errorDescription>Final error description: A timeout occurred waiting for data from the agent during operation shutdown.</errorDescription>
<errorCategory>Final error category: Resource Errors</errorCategory>
</footer>
</joblog>
--------------------------------------------------------------------------------------------------------
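One detail that helps when cross-checking the logs: the "Media server port" values in the server log look like hex with the two bytes swapped. This is an assumption, but it is consistent with the failed job, where the server log shows 2a0f and the job log shows the data connection on backupserver:3882. A minimal sketch of the conversion:

```python
def swap16(hex_str: str) -> int:
    """Read a 4-digit hex string as a 16-bit value with its two bytes swapped."""
    raw = bytes.fromhex(hex_str)
    if len(raw) != 2:
        raise ValueError("expected exactly four hex digits")
    # Assumption: the log prints the port with the byte order reversed.
    return int.from_bytes(raw, "little")


# Port strings copied from the OpenListenSocket lines in the server log.
for label, port_hex in [("successful job", "6507"), ("failed job", "2a0f")]:
    print(f"{label}: {port_hex} -> port {swap16(port_hex)}")
```

The failed job's value decodes to 3882, matching the job log. The successful job's value decodes to 1893, but there is no job log for that run in the post to confirm it. The "Media server IP address" field may be encoded the same way, though nothing here confirms that.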
So far :-
I have followed numerous forum posts and have changed some registry settings to increase the timeout period for communication between the agent and the server - no change, I still get the error.
Settings were as follows :-
------
Symptom:
The error: "A timeout occurred waiting for data from the agent during operation shutdown" is returned when performing a backup operation with Backup Exec 9.0 for Windows Servers.
Exact Error Message:
a00084f8 HEX - A timeout occurred waiting for data from the agent during operation shutdown.
Solution:
This issue occurs when the timeout period expires for the Remote Agent for Windows Servers (RAWS).
To correct this problem, increase the timeout periods as follows:
1. Open regedit or regedt32 on the Backup Exec media server.
2. Increase the value of the following keys:
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Agent Browser\TCPIP\Expire Time to 1200 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Engine\Agents\Data Connection Flush Timeout Seconds to 1800 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Engine\Agents\NDMP Connect Open Time Out Seconds to 300 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Engine\Agents\Notify Data Halted Time Out Seconds to 300 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Network\TCPIP\Disconnect Delay to 1500 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Network\TCPIP\WorkBufferSize to 32768 (Decimal)
Set the registry value HKEY_LOCAL_MACHINE\Software\VERITAS\Backup Exec\Engine\NTFS\Restrict Anonymous Support to 1. Create the value if necessary.
3. Stop all Backup Exec Services
4. Start up Backup Exec Services
------
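For anyone applying the same settings, here is a rough sketch that renders the values from the quoted article as .reg-file text, so they can be reviewed in one place and compared against what regedit shows (regedit displays DWORD values in hex by default). Storing every value as a REG_DWORD is an assumption; the article only gives decimal numbers. The `SETTINGS` and `render_reg` names are made up for this example. Review the output and back up the registry before importing anything.

```python
# Values copied from the quoted article. Treating them all as REG_DWORD
# is an assumption; the article does not state the value type.
SETTINGS = {
    r"SOFTWARE\VERITAS\Backup Exec\Agent Browser\TCPIP": {
        "Expire Time": 1200,
    },
    r"SOFTWARE\VERITAS\Backup Exec\Engine\Agents": {
        "Data Connection Flush Timeout Seconds": 1800,
        "NDMP Connect Open Time Out Seconds": 300,
        "Notify Data Halted Time Out Seconds": 300,
    },
    r"SOFTWARE\VERITAS\Backup Exec\Network\TCPIP": {
        "Disconnect Delay": 1500,
        "WorkBufferSize": 32768,
    },
    r"SOFTWARE\VERITAS\Backup Exec\Engine\NTFS": {
        "Restrict Anonymous Support": 1,
    },
}


def render_reg(settings: dict) -> str:
    """Render the settings as the text of a .reg file (DWORDs in 8-digit hex)."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for subkey, values in settings.items():
        lines.append(f"[HKEY_LOCAL_MACHINE\\{subkey}]")
        for name, number in values.items():
            lines.append(f'"{name}"=dword:{number:08x}')
        lines.append("")  # blank line between keys
    return "\n".join(lines)


print(render_reg(SETTINGS))
```

The script only prints text and does not touch the registry. As in steps 3 and 4 of the article, the Backup Exec services still need to be restarted after any change.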
Moved the server onto the same local subnet as the backup server. This is to rule out switching problems. We back up seven servers in the same subnet without a problem.
Ruled out any potential problems with running Anti Virus alongside backup.
Ruled out media problems
Ruled out any potential port conflicts, i.e. ports being open at the same time (I have not tested this in practice, only checked the logs)