UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
CUCM 8.5, 3 nodes (1 Pub & 2 Subs)
second Sub recently added (middle of last week)
1400 users

First report 2 days ago (this past Tuesday): 1 end user reported a dropped call; her phone (7942) shows UCM fail, then goes to re-registering.

Yesterday (Wednesday) about 5 more complaints, various but similar symptoms, some reporting 1-way audio, some reporting dropped call.

This is occurring internally within the local network on station to station calls as well as on external (trunk) calls.  On internal failures 1 party will see "UCM Down" the other party will see "Fail".

Last night we forced everything over to the *Pub* but today (Thurs) we're still getting isolated reports.

Recent changes:  Last week we added the 2nd sub. The original sub was 150 miles away and prior to adding a local sub everything was registering to the local *Pub* (bad design I know, but I didn't do it, the VAR did) - anyway after adding the 2nd sub (which is now local to us) we moved everything over from the Pub to the local Sub. That was a week ago today.

No problem reports last Friday or this past Monday. Problem reports started coming in on Tuesday of this week, 5 days after introducing the new (local) Sub to the mix and moving everyone over to it.

Ideas welcomed.
Thanks!!

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Here's a dump from a phone that experienced the problem at 12:48 PM today

Console Log 53:

CODE

|== Syslogd TNP== Thu Mar 31 12:48:38 2011
====================================================
2224: NOT 12:48:38.217138 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2225: NOT 12:48:38.218713 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2226: NOT 12:48:38.220462 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2227: NOT 12:48:38.222012 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2228: NOT 12:48:40.075773 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2229: NOT 12:48:40.077417 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2230: NOT 12:48:40.079006 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2231: NOT 12:48:40.081377 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2232: NOT 12:48:43.687511 DSP: ====== A phone call starts ....
2233: NOT 12:48:43.688241 DSP:  ETHSTAT-  (unicast, broadcast, multicast) rx  = 814907, 71456, 65769; tx = 850160, 577
2234: NOT 12:48:43.688921 DSP:  MIB2-  ipInDelivers= 806095, udpInDG= 726890, udpOutDG= 724081, udpNoPort= 1727, udpInErr= 0, icmpInDestUnreach = 32, icmpOutDestUnreach = 14
2235: NOT 12:48:43.689630 DSP: STREAM- OpenEgressChan- ChanType 1, local (multicast host 0, port 0), MedType 6, Period 20, stream (17140014, 17140014) mix (1, 3)
2236: NOT 12:48:43.690391 DSP: STREAM- OpenEgressChan --> local port x0, reserved port x0, --> Chan 0
2237: NOT 12:48:43.692052 DSP: Subtracted for CODEC[4] G.722 direction:0 cost:13 old budget:100
2238: NOT 12:48:43.695307 DSP: TRSTREAM, isrEgressReqDiscardpreStreaming 0
2239: NOT 12:48:43.742584 DSP: STREAM- OpenIngressChan- ChanType 1, Remote (host af1ad15, port 5362), medType 6, Period 20, VAD 0, TOS b8, stream (17140014, 17140014) --> chan 0
2240: NOT 12:48:43.743368 DSP: STREAM- OpenIngressChan- mix (1, 3), dtmfpayloadtype 0
2241: WRN 12:48:43.744115 DSP: bad ARP Add, -126
2242: NOT 12:48:43.744807 DSP: Subtracted for CODEC[4] G.722 direction:1 cost:20 old budget:87
2243: NOT 12:48:43.934383 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2244: NOT 12:48:43.937429 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2245: WRN 12:50:46.548171 JVM: Startup Module Loader|cip.sccp.cn:? - Socket Timeout exception: cip.io.SocketTimeoutException: Connection timed out: Connection timed out
 Close(d) Connection ..., errno=145
2246: ERR 12:50:46.556467 JVM: 12:50:46p|cip.io.SocketTimeoutException: Connection timed out: Connection timed out
    at cip.io.SecureInputStream.socketRead([BII)I(Native Method)
    at cip.io.SecureInputStream.read([BII)I(Unknown Source)
    at java.io.BufferedInputStream.fill()V(Unknown Source)
    at java.io.BufferedInputStream.read()I(Unknown Source)
    at java.io.DataInputStream.readInt()I(Unknown Source)
    at cip.io.i.readInt()I(Unknown Source)
    at cip.sccp.ax.a()Lcip/sccp/bu;(Unknown Source)
    at cip.sccp.cn.e()V(Unknown Source)
    at cip.sys.l.run()V(Unknown Source)
    at java.lang.Thread.startup(Z)V(Unknown Source)
2247: WRN 12:50:46.558052 JVM: Startup Module Loader|cip.sccp.SccpEnhancedAlarmInfo:setLastDeregistrationReason - new reason=LastTimeTCPtimeout current=
2248: WRN 12:50:46.559152 JVM: Startup Module Loader|cip.sccp.CcApi:? - alarm sending failure:10: Name=SEP1C17D341D180 Load= SCCP45.9-1-1SR1S Last=TCP-timeout
2249: NOT 12:50:46.587698 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=false
2250: NOT 12:50:46.588815 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - DISABLED
2251: ERR 12:50:53.426303 login: [30]:loginSIGTERM signo:12
2252: NOT 12:50:53.432018 init: Starting /bin/login
2253: NOT 12:50:53.434971 init: /bin/login started as pid=24
2254: WRN 12:50:54.032446 JVM: Startup Module Loader|cip.sccp.cn:? - Socket Timeout exception: cip.io.SocketTimeoutException: Connection timed out: Connection timed out
 Close(d) Connection ..., errno=145
2255: ERR 12:50:54.040161 JVM: 12:50:54p|cip.io.SocketTimeoutException: Connection timed out: Connection timed out
    at cip.io.SecureInputStream.socketRead([BII)I(Native Method)
    at cip.io.SecureInputStream.read([BII)I(Unknown Source)
    at java.io.BufferedInputStream.fill()V(Unknown Source)
    at java.io.BufferedInputStream.read()I(Unknown Source)
    at java.io.DataInputStream.readInt()I(Unknown Source)
    at cip.io.i.readInt()I(Unknown Source)
    at cip.sccp.ax.a()Lcip/sccp/bu;(Unknown Source)
    at cip.sccp.cn.e()V(Unknown Source)
    at cip.sys.l.run()V(Unknown Source)
    at java.lang.Thread.startup(Z)V(Unknown Source)
2256: ERR 13:00:00.001279 NTP: Thu Mar 31 12:00:00 2011
2257: NOT 13:30:19.091768 DSP: STREAM- CloseIngressChan- ChanType 1, stream (17140014, 17140014) --> Chan 0
2258: NOT 13:30:19.098709 DSP: Added back for CODEC[4] G.722 direction:1 cost:20 old budget:67
2259: NOT 13:30:19.099576 DSP: STREAM- CloseEgressChan- ChanType 1, stream (17140014, 17140014) --> Chan 0
2260: NOT 13:30:19.103628 DSP: Added back for CODEC[4] G.722 direction:0 cost:13 old budget:87
2261: NOT 13:30:19.104665 DSP:  ETHSTAT-  (unicast, broadcast, multicast) rx  = 937819, 71637, 65894; tx = 974971, 578
2262: NOT 13:30:19.104968 DSP:  MIB2-  ipInDelivers= 928991, udpInDG= 849777, udpOutDG= 848852, udpNoPort= 1727, udpInErr= 0, icmpInDestUnreach = 32, icmpOutDestUnreach = 14
2263: WRN 13:30:19.354159 JVM: Startup Module Loader|cip.mmgr.dt:? - [MediaMgrSM]: Unhandled Event, State = StateOnHook Event = EventEndcall
2264: NOT 13:30:19.499863 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2265: NOT 13:30:19.501014 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2266: NOT 13:30:19.502106 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2267: NOT 13:30:19.503200 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2268: WRN 13:30:19.571530 CDP-D: lldpInetdStatsRsp: port: 0
2269: NOT 13:30:19.946171 JVM: LibMT (vieoProcess.c): sendVieoStart()
2270: WRN 13:30:20.367116 JVM: Startup Module Loader|cip.sccp.SccpEnhancedAlarmInfo:setLastDeregistrationReason - new reason=LastTimeInitialized current=
2271: WRN 13:30:20.373032 JVM: Startup Module Loader|cip.sccp.SccpEnhancedAlarmInfo:propertyChanged - name=device.settings.fullyregistered value=false
2272: NOT 13:30:20.380293 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=0
2273: NOT 13:30:20.382037 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=1
2274: NOT 13:30:20.414487 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true
2275: NOT 13:30:20.416144 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval
2276: NOT 13:30:20.417738 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED
2277: NOT 13:30:20.419305 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true
2278: NOT 13:30:20.421115 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval
2279: NOT 13:30:20.422734 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED
2280: NOT 13:30:20.424291 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=1
2281: NOT 13:30:20.425895 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=1
2282: NOT 13:30:20.427479 JVM: Startup Module Loader|cip.cfg.s:? - Requesting CONFIG file from TFTP Service(1)
2283: NOT 13:30:20.429091 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2284: NOT 13:30:20.431442 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2285: NOT 13:30:20.433034 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.callagent.callcount
2286: NOT 13:30:20.434559 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.callagent.callcount
2287: ERR 13:30:20.446244 JVM: tftpClient SEP1C17D341D180.cnf.xml /usr/ram/SEP1C17D341D180.cnf.xml 550001 1
2288: NOT 13:30:20.450766 tftpClient: tftp request rcv'd from /usr/tmp/tftp, srcFile = SEP1C17D341D180.cnf.xml, dstFile = /usr/ram/SEP1C17D341D180.cnf.xml max size = 550001
2289: NOT 13:30:20.471479 tftpClient: auth server - tftpList[0] = ::ffff:10.241.0.100
2290: NOT 13:30:20.472128 tftpClient: look up server - 0
2291: NOT 13:30:20.474399 SECD: lookupCTL: TFTP SRVR secure
2292: NOT 13:30:20.477321 tftpClient: secVal = 0x9
2293: NOT 13:30:20.478054 tftpClient: ::ffff:10.241.0.100 is a secure server
2294: NOT 13:30:20.478635 tftpClient: look up server - 1
2295: NOT 13:30:20.480923 SECD: lookupCTL: TFTP SRVR secure
2296: NOT 13:30:20.483673 tftpClient: secVal = 0x9
2297: NOT 13:30:20.484412 tftpClient: ::ffff:10.249.0.100 is a secure server
2298: NOT 13:30:20.484976 tftpClient: retval = SRVR_SECURE
2299: NOT 13:30:20.485577 tftpClient: Secure file requested
2300: NOT 13:30:20.486142 tftpClient: authenticated file approved - add .sgn -- SEP1C17D341D180.cnf.xml.sgn  
2301: WRN 13:30:20.516469 JVM: Startup Module Loader|cip.mmgr.dt:? - [MediaMgrSM]: Unhandled Event, State = StateOnHook Event = EventServicesTxStop
2302: NOT 13:30:20.540330 TFTP: [11]:Requesting SEP1C17D341D180.cnf.xml.sgn from 10.241.0.100 with size limit of 550001
2303: NOT 13:30:20.556679 TFTP: [11]:Finished --> rcvd 9585 bytes
2304: NOT 13:30:20.562741 SECD: verifyFile: sgn-verify </usr/ram/SEP1C17D341D180.cnf.xml>, 'name'[SEP1C17D341D180.cnf.xml.sgn]
2305: NOT 13:30:20.564624 SECD: parseHdr(): start of pad ('T' 0x0d) at TLV 15
2306: NOT 13:30:20.565312 SECD: parseHdr(): skipping 1 trail bytes (pad and/or unknown TLVs)
2307: NOT 13:30:20.567281 SECD: parseHdr(): start of pad ('T' 0x0d) at TLV 15
2308: NOT 13:30:20.567977 SECD: parseHdr(): skipping 1 trail bytes (pad and/or unknown TLVs)
2309: NOT 13:30:20.617840 SECD: file sgn verify SUCCESS, hdr 332 byte, </usr/ram/SEP1C17D341D180.cnf.xml>
2310: NOT 13:30:20.618724 SECD: verifyFile: file sgn verified </usr/ram/SEP1C17D341D180.cnf.xml>, hdrlen 332
2311: NOT 13:30:20.619443 SECD: verifyFile: hdr ver [2.0], and file not encr, </usr/ram/SEP1C17D341D180.cnf.xml>
2312: NOT 13:30:20.629804 SECD: verifyFile: 9253 byte after hdr strip, </usr/ram/SEP1C17D341D180.cnf.xml>
2313: NOT 13:30:20.630797 SECD: verifyFile: verify SUCCESS </usr/ram/SEP1C17D341D180.cnf.xml>
2314: NOT 13:30:20.634128 tftpClient: authorize file = 12, isEncr = 0
2315: NOT 13:30:20.640624 SECD: lookupCTL: TFTP SRVR secure
2316: ERR 13:30:21.421722 JVM: TVS server is : IPv4 : 10.241.0.100, IPv6 : , Port : 2445, IPv4 : 10.241.0.105, IPv6 : , Port : 2445, IPv4 : 10.249.0.100, IPv6 : , Port : 2445, IP Mode : 0
2317: NOT 13:30:21.438192 JVM: Startup Module Loader|cip.cfg.s:? - Config handleTftpResponse, status=0 for file=ram/SEP1C17D341D180.cnf.xml
2318: WRN 13:30:21.440132 JVM: Startup Module Loader|cip.xml.ap:parse - Encoding Updated to UTF-8
2319: WRN 13:30:21.441733 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'name' in element '/device/devicePool' (line=16)
2320: WRN 13:30:21.443337 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'name' in element '/device/devicePool/dateTimeSetting' (line=18)
2321: WRN 13:30:21.444964 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'name' in element '/device/devicePool/callManagerGroup' (line=38)
2322: WRN 13:30:21.446620 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'tftpDefault' in element '/device/devicePool/callManagerGroup' (line=39)
2323: WRN 13:30:21.448241 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mgcpPorts' in element '/device/devicePool/callManagerGroup/members/member/callManager/ports' (line=49)
2324: WRN 13:30:21.449866 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mgcpPorts' in element '/device/devicePool/callManagerGroup/members/member/callManager/ports' (line=65)
2325: WRN 13:30:21.452191 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mgcpPorts' in element '/device/devicePool/callManagerGroup/members/member/callManager/ports' (line=81)
2326: WRN 13:30:21.453843 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'name' in element '/device/devicePool/srstInfo' (line=92)
2327: WRN 13:30:21.455468 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'userModifiable' in element '/device/devicePool/srstInfo' (line=94)
2328: WRN 13:30:21.457110 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mlppDomainId' in element '/device/devicePool' (line=109)
2329: WRN 13:30:21.458744 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mlppIndicationStatus' in element '/device/devicePool' (line=110)
2330: WRN 13:30:21.460702 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'preemption' in element '/device/devicePool' (line=111)
2331: WRN 13:30:21.462381 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'uid' in element '/device/networkLocaleInfo' (line=154)
2332: WRN 13:30:21.463939 JVM: Startup Module Loader|cip.xml.ap:  - XML Parser Warning: Unknown element 'mobility' in element '/device' (line=194)
2333: ERR 13:30:21.466268 JVM: Startup Module Loader|cip.cfg.s:? - DirectoryUrl http://10.241.0.100:8080/ccmcip/xmldirectory.jspsecuredirectoryUrl http://10.241.0.100:8080/ccmcip/xmldirectory.jsp
2334: NOT 13:30:21.470725 JVM: Startup Module Loader|cip.cfg.s:? - setConfigTvsProperty
2335: NOT 13:30:21.472292 JVM: Startup Module Loader|cip.sec.TvsProperty:? - TVS IPv4 - 1 :10.241.0.100TVS IPv6 - 1 :TVS Port - 1:2445TVS IPv4 - 2 :10.241.0.105TVS IPv6 - 2 :TVS Port - 2:2445TVS IPv4 - 3 :10.249.0.100TVS IPv6 - 3 :TVS Port - 3:2445
2336: NOT 13:30:21.473911 JVM: Startup Module Loader|cip.sec.TvsProperty:? - Resolve Tvs Ipv4 Address to 10.241.0.100from hostname 10.241.0.100
2337: NOT 13:30:21.475537 JVM: Startup Module Loader|cip.sec.TvsProperty:? - Resolve Tvs Ipv4 Address to 10.241.0.105from hostname 10.241.0.105
2338: NOT 13:30:21.477097 JVM: Startup Module Loader|cip.sec.TvsProperty:? - Resolve Tvs Ipv4 Address to 10.249.0.100from hostname 10.249.0.100
2339: NOT 13:30:21.496639 SECD: loadTvsSrvrCfg: Not in EMCC mode.Loading the flash file :/flash0/sec/misc/tvs.conf
2340: NOT 13:30:21.506375 JVM: emccMode=0,localOverride=0, tftpAddr1=, tftpAddr2=,tftpAddr3=,tftpAddr4=
2341: NOT 13:30:21.509529 tftpClient: tftp request rcv'd from /usr/tmp/tftp, emccMode =0, emccLocalOverride=0, tempTftp1= , tempTftp2 = , tempTftp3 = , tempTftp4 =  
2342: NOT 13:30:21.511581 JVM: setTempTftpAddress, emcc_mode=0,retEmccMode=0,LocalOverride=0,retLocalOverride=0, status=1
2343: NOT 13:30:21.516830 SECD: clearTFTPList: cleared all TFTP entries
2344: ERR 13:30:21.845660 JVM: Startup Module Loader|cip.cfg.s:? - Delete of sshUserInfo file failed
2345: ERR 13:30:21.847252 JVM: Startup Module Loader|cip.cfg.s:? - informationUrl is https://10.241.0.100:8443/ccmcip/GetTelecasterHelpText.jsp
2346: ERR 13:30:21.848854 JVM: Startup Module Loader|cip.cfg.s:? - directoriesUrl is http://10.241.0.100:8080/ccmcip/xmldirectory.jsp
2347: ERR 13:30:21.851189 JVM: Startup Module Loader|cip.cfg.s:? - messagesUrl is
2348: ERR 13:30:21.852731 JVM: Startup Module Loader|cip.cfg.s:? - servicesUrl is https://10.241.0.100:8443/ccmcip/getservicesmenu.jsp
2349: ERR 13:30:21.854336 JVM: Startup Module Loader|cip.cfg.s:? - authenticationUrl is https://10.241.0.100:8443/ccmcip/authenticate.jsp
2350: ERR 13:30:21.855848 JVM: Startup Module Loader|cip.cfg.s:? - idleUrl is
2351: ERR 13:30:21.857388 JVM: Startup Module Loader|cip.cfg.s:? - messagesUrl is null
2352: ERR 13:30:21.858964 JVM: Startup Module Loader|cip.cfg.s:? - After set servicesUrl device.settings.config.servicesurl value https://10.241.0.100:8443/ccmcip/getservicesmenu.jsp
2353: ERR 13:30:21.860880 JVM: Startup Module Loader|cip.cfg.s:? - After set info url device.settings.config.informationurl value https://10.241.0.100:8443/ccmcip/GetTelecasterHelpText.jsp
2354: ERR 13:30:21.862550 JVM: Startup Module Loader|cip.cfg.s:? - After set  dir Url device.settings.config.directoriesurl value http://10.241.0.100:8080/ccmcip/xmldirectory.jsp
2355: ERR 13:30:21.864129 JVM: Startup Module Loader|cip.cfg.s:? -  idle Url of null
2356: ERR 13:30:21.865712 JVM: Startup Module Loader|cip.cfg.s:? - After set  sec auth Url device.settings.config.authenticationurl value https://10.241.0.100:8443/ccmcip/authenticate.jsp
2357: ERR 13:30:21.867369 JVM: Startup Module Loader|VendorConfig:? - vendorconfig : setConfigProperties
2358: ERR 13:30:21.868873 JVM: Startup Module Loader|VendorConfig:? -
About to set WebAccess
2359: ERR 13:30:21.871203 JVM: Startup Module Loader|VendorConfig:? - WebAccess true
2360: ERR 13:30:21.872752 JVM: Startup Module Loader|VendorConfig:? - WebProtocol 0
2361: NOT 13:30:21.874546 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.config.localization.userlocale.charset value=iso-8859-1
2362: NOT 13:30:21.876212 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.config.localization.userlocale.languagecode value=en_US
2363: NOT 13:30:21.882483 SECD: setSecMode: sec mode set to NONE (was NONE)
2364: NOT 13:30:22.645532 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.config.phoneservices value=cip.props.XmlProperty@2f80cf
2365: NOT 13:30:22.647251 INETD: Set IP mode 1
2366: NOT 13:30:22.647909 INETD: Requestted IP mode is same as current mode
2367: DBG 13:30:22.654302 VPNU: SM wakeup - chld=0 tmr=0 io=1 res=0
2368: DBG 13:30:22.667099 VPNU: No VPN database change
2369: NOT 13:30:22.671966 SECD: clearSRSTList: cleared all SRST entries
2370: WRN 13:30:22.815817 SECD: WARN:cancelCapfOp: CAPF not in use, user cancel ignored
2371: NOT 13:30:22.816507 SECD: clearCapfList: CAPF table cleared
2372: NOT 13:30:23.162181 JVM: Startup Module Loader|cip.cfg.s:? - CUCM in config file: #0 XmlCallAgentObject: IPv4-name=[10.241.0.100] IPv6-name=[] port=2000 priority=0
2373: NOT 13:30:23.163806 JVM: Startup Module Loader|cip.cfg.s:? - CUCM in config file: #1 XmlCallAgentObject: IPv4-name=[10.241.0.105] IPv6-name=[] port=2000 priority=1
2374: NOT 13:30:23.165408 JVM: Startup Module Loader|cip.cfg.s:? - CUCM in config file: #2 XmlCallAgentObject: IPv4-name=[10.249.0.100] IPv6-name=[] port=2000 priority=2
2375: NOT 13:30:23.167028 JVM: Startup Module Loader|cip.cfg.s:? - CUCM in config file: #3 XmlCallAgentObject: IPv4-name=[] IPv6-name=[] port=0 priority=32767
2376: NOT 13:30:23.168595 JVM: Startup Module Loader|cip.cfg.s:? - CUCM in config file: #4 XmlCallAgentObject: IPv4-name=[] IPv6-name=[] port=0 priority=32767
2377: ERR 13:30:23.171762 JVM: Startup Module Loader|cip.sec.CapfProperty:? - Failed to resolve Capf Ipv4 Address with hostname
2378: ERR 13:30:23.173315 JVM: Startup Module Loader|cip.sec.CapfProperty:? - Failed to resolve Capf Ipv6 Address with hostname
2379: ERR 13:30:23.175660 JVM: Startup Module Loader|cip.sec.CapfProperty:? - No valid CAPF server
2380: NOT 13:30:23.177224 JVM: Startup Module Loader|cip.cfg.s:? - Config processConfigNoError() result code=CONFIG_FILE_NO_CHANGE
2381: NOT 13:30:23.178728 JVM: Startup Module Loader - Deletion of file Successful/usr/ram/SEP1C17D341D180.cnf.xml
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

so 10.241.0.100 is your pub.
What are these two ip's:
10.241.0.105
10.249.0.100

Nothing else was changed on the network? Any switch upgrades, routing changes, anything that would affect the network?

The fact that phones are reregistering could mean that they lose connectivity to the call managers. Unless you are running into a bug I would say you have a network issue.

The log above spans over an hour and a half, by the way.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
No other changes than those listed:
1. Upgraded from 8.0 to 8.5
2. Added a 3rd node (sub) to the cluster
3. Pushed everyone over to the new sub.

As far as we know, we ran clean for about 4 days after the s/w upgrade and introduction of the 3rd node, but there was a weekend in the middle. 1st report of a problem came last Tuesday AM. About 1:30 Thursday AM (yesterday) we pushed the phones back over to the Pub (where they had been since original system commissioning almost a year ago) but the problem was still occurring. TAC is looking at it.

What's odd is that some internal calls are dropping. Network topology consists of 3560G POE switches back to a 6500 core, with dual redundant gig fiber interfaces on the 3560Gs. Looking at the interface on the 3560G we do not see the port flapping. Rock-stable. It's as though there's a gremlin in the codec on the phones.
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
10.241.0.100 is the pub
10.241.0.105 is the (new) sub
10.249.0.100 is the original sub (150 miles away)

Original configuration was with the phones (all 1400 of them) registering to the pub. We didn't register them to the original sub because it wasn't local to us.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

This shows that the phone lost its LAN connectivity to CUCM:

2254: WRN 12:50:54.032446 JVM: Startup Module Loader|cip.sccp.cn:? - Socket Timeout exception: cip.io.SocketTimeoutException: Connection timed out: Connection timed out
 Close(d) Connection ..., errno=145

Check the LAN...congestion? CUCM CPU screaming??
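
For anyone wanting to pull entries like the one quoted above out of a full phone console log, here is a minimal Python sketch. It assumes every entry follows the `seq: LEVEL HH:MM:SS.micros COMPONENT: message` layout seen in the dump earlier in this thread. The function names are made up for illustration and are not part of any Cisco tool. The log carries no date, so gaps that cross midnight would come out wrong.

```python
import re
from datetime import datetime

# Matches console-log entries such as
# "2245: WRN 12:50:46.548171 JVM: Startup Module Loader|..."
ENTRY_RE = re.compile(
    r"^(?P<seq>\d+): (?P<level>[A-Z]{3}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<component>[^:\s]+): ?(?P<message>.*)$"
)


def parse_entries(lines):
    """Parse numbered log entries; stack-trace and continuation lines are skipped."""
    entries = []
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m is None:
            continue
        entry = m.groupdict()
        entry["seq"] = int(entry["seq"])
        entry["time"] = datetime.strptime(entry["time"], "%H:%M:%S.%f")
        entries.append(entry)
    return entries


def socket_timeouts(entries):
    """Return (seq, seconds since the previous entry) for each timeout warning.

    The very first entry is never reported because it has no predecessor.
    """
    hits = []
    for prev, cur in zip(entries, entries[1:]):
        if cur["level"] == "WRN" and "SocketTimeoutException" in cur["message"]:
            gap = (cur["time"] - prev["time"]).total_seconds()
            hits.append((cur["seq"], gap))
    return hits
```

A long silent gap before the warning suggests the phone was waiting on the TCP connection rather than logging other activity, which is the pattern to look for across several affected phones.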

 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks Lou.

Yes, we know what's occurring; the issue is why.  We continuously have Solar Winds monitoring literally every nook and cranny of the network and have seen *NO* congestion, with the busiest path never exceeding 30% utilization during peak loads over the last 30 days.

As stated, this problem just started showing up a few days (not immediately) after:
1. upgrading from 8.0 to 8.5,
2. adding a 3rd node to the cluster (a local sub), and
3. pushing the (approx. 1400) sets over to the new sub from the pub.

We tried pushing the sets back to the PUB, with no improvement. We of course still have 8.0 on the inactive partition & there's been some thought about going back, but my understanding is we'd have to migrate the database back to 8.0 as well.

There has been no network reconfiguration activity. The platform is with the CUCM connected to a 3560G w/dual redundant gig fiber back to a dual redundant 6500 (core). The core routers have dual redundant 2 gig links between them.  Out on the floor the phones connect to other 3560G switches in the wiring closets and these are connected back to the core via dual/redundant gig fibers. Voice is isolated on a separate VLAN.

We're going blind looking at sniffer traces. TAC agrees that initial diagnostics strongly suggest a network issue, but nothing definitive.

What about the keep-alive? What happens if a KA is missed? Didn't something change recently to increase that poll rate?

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Keep Alive - Cisco onsite engineer's thoughts (quoted from Cisco docs)

Geometric TCP

The Cisco Unified IP Phone firmware 7.2(1) introduced a Geometric TCP mechanism to permit IP Phones to measure the round-trip delay between the IP Phone and Unified CM, then adapt the keepalive timeout value. This provided a very accurate failover mechanism when the network delay is consistent.

However, if the network delay is inconsistent, this mechanism may cause the IP Phones to inaccurately attempt failover. The Cisco Unified IP Phone firmware 8.4(2) introduces the ability for the Network Administrator to disable this behavior, if necessary, through the Detect Unified CM Connection Failure parameter defined on the IP Phone device configuration. The default value is Normal; this Geometric TCP mechanism can be disabled if the parameter is set to Delayed.

End quote.
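
To illustrate why an RTT-adaptive keepalive timeout can misfire when delay is inconsistent, here is a rough Python sketch. It is NOT Cisco's Geometric TCP algorithm, whose details are not given in the quoted text. The multiplier, floor, window size and function names are all hypothetical and chosen only to show the general failure mode next to a fixed timeout.

```python
def adaptive_timeout(rtt_samples, multiplier=4.0, floor=0.2):
    """Hypothetical rule: timeout = multiplier * mean recent RTT, never below floor (seconds)."""
    mean_rtt = sum(rtt_samples) / len(rtt_samples)
    return max(floor, multiplier * mean_rtt)


def false_failovers(rtts, window=5, multiplier=4.0, floor=0.2, fixed_timeout=None):
    """Count keepalive replies that arrive later than the timeout in force.

    With fixed_timeout=None the timeout adapts to the previous `window`
    RTT samples; otherwise the same fixed timeout applies to every reply.
    """
    count = 0
    for i in range(window, len(rtts)):
        if fixed_timeout is not None:
            timeout = fixed_timeout
        else:
            timeout = adaptive_timeout(rtts[i - window:i], multiplier, floor)
        if rtts[i] > timeout:
            count += 1
    return count
```

Under these assumed parameters, a single delayed reply on an otherwise fast LAN exceeds the adapted timeout, while a generous fixed timeout tolerates it. That is the trade-off the quoted Cisco text describes between the Normal and Delayed settings.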
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

The only other reason besides network delay/temp failure would be the call manager service stopping. I have seen that, but in older 6.X versions. When that happens ALL your phones will reset, as well as gateways, trunks, voicemail, etc.
From your description this is not happening, so you need to keep looking at your network.
Regardless of the method of keep alive a phone is using, there should be NO reason that the call manager would not respond unless it is not reachable. Especially on your LAN.  Why is it not reachable at that time? I can't tell you that unless I have my hands on your network.

This would be my next steps if you have not done that already.

> Do whatever TAC is telling you, regardless of "no network changes were done". I have heard that a million times, but the possibility needs to be eliminated.
> Are the phones that are dropping repeat offenders, or on the same switches? If you take a close look you might be able to isolate it to a single closet or even a single switch.
> If you are positive that it was the upgrade, go back to 8.0 and see if it goes away. I understand that database changes have been made since, but it is a small price to pay to find out fast whether it is the upgrade that caused it.
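
The "repeat offenders or same switches" check above can be sketched in Python as a simple tally of unregister events per device and per subnet. This assumes you have already collected (device name, IP address) pairs from the Help Desk reports or CUCM traces, and that voice subnets are /24s. Both function names are illustrative.

```python
from collections import Counter
import ipaddress


def tally_by_subnet(events, prefix=24):
    """Count events per subnet; events is an iterable of (device_name, ip) pairs."""
    counts = Counter()
    for _device, ip in events:
        # strict=False lets a host address stand in for its containing network
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(net)] += 1
    return counts


def repeat_offenders(events, threshold=2):
    """Return device names that appear in at least `threshold` events, sorted."""
    per_device = Counter(device for device, _ip in events)
    return sorted(d for d, n in per_device.items() if n >= threshold)
```

If one subnet or a handful of devices dominates the counts, that points at a closet or switch. An even spread across the building points back at something shared, such as the core or the CUCM nodes.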

Having said all that, it is very possible that you are running into a bug with the new version, or even an IOS issue/incompatibility that can cause this. Make sure all your switch IOS versions are compatible with version 8.5 of call manager.
Unfortunately 8.5 is only a couple of months into general release so you might be discovering a new bug.

 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks.

I have 36 wiring closets spread across 9 floors (approx 55,000 sq.ft. of office space per floor).  The phones are in VLAN 600. We have 2 class C networks per floor, with wiring closets A & B on one network and closets C & D on the other. This is convenient as it better organizes the floor layouts. IE, tell me your IP address and I know what floor you're on and what end of the building you are located.

The typical wiring closet in the building will have between 3 and 4 3560G POE switches, each switch having a dual/redundant fiber path back to dual/redundant 6500 core routers. Each of the fiber interfaces on every switch in every closet is monitored by Solar Winds. I'd like to think if something is going on, we'd see it.

So far there are no "repeat offenders". The problem is jumping around all over the building, though fortunately not prolific. We're hearing only 6 or 7 reports of the anomaly per day, though we also realize that maybe only 1 in 10 users will actually trouble themselves to call the Help Desk to report it, so how bad is the problem really? Hard to say. By some miracle it has so far avoided the executive suite. Knock on wood.

The problem coming on the heels of a system S/W upgrade makes us very suspicious of the new load, vers 8.5(1) but the symptoms all seem to point to a network problem. To the best of anyone's knowledge the network has not been touched in the last 10 days. We say this with some degree of certainty, as network config changes, even something as simple as naming an interface, all has to be written up and get the blessings of our internal change control process before it's done and violating that policy will get you fired.

I've read the 8.x docs on the version switching process, which seems simple enough, albeit service-affecting, requiring it be a 2AM task. But what about our database? Can we go back to 8.0 from 8.5 without losing everything that's been done in the interim?

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

No, the database is a snapshot of the existing db at the time of the upgrade. Any changes after the upgrade have to be manually re-entered. However, if you are positive it is the upgrade that caused this, that is your only way out unless Cisco has a bug fix for you.

Here's the thing. Some of the symptoms you are describing, like 1-way audio, are always a network issue and, to be specific, a routing issue.

Now mix that with audio fading complaints and you might have a DSP issue, but on internal dialing no DSPs are used.

Are any of your gateways resetting? How about any ATAs, VG-224s, etc.? Or just phones?

Have you double checked your DHCP scopes, leases etc?
Does Solar winds detect a duplex-mismatch on a port?

As I said, there are too many variables to put a finger on it, and I could be sending you on a wild goose chase.

If I were you I would involve the network team if that is not yourself and also get TAC on line. If this is service affecting make the case a P2 and they will work with you till the issue is identified and fixed.
 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks

Here's a screenshot of the event right as it occurred (sdi log captured out of CUCM) - this was a 1-way audio event, failure code was reason code 6:

CODE

  9337  10:40:35.363 |InboundStim - KeepAliveMessage - Send KeepAlive to Device Controller. DeviceName=SEP1C17D3C304EC, TCPPid = [1.100.9.2931609], IPAddr=10.241.18.33, Port=52664, Device Controller=[1,51,2780341]|1,100,50,1.111385963^10.241.18.33^SEP1C17D3C304EC
   13378  10:40:38.071 |EndPointUnregistered - An endpoint has unregistered Device name:SEP1C17D3C304EC Device IP address:10.241.18.33 Protocol:SCCP Device type:434 Device description:Diane Anderson - 8004123 Reason Code:6 IPAddressAttributes:3 CallState:8004123-active10 App ID:Cisco CallManager Cluster ID:StandAloneCluster Node ID:HOUCUCM01|AlarmSEP1C17D3C304EC^*^SEP1C17D3C304EC
   13380  10:40:38.071 |Device UnRegister deviceName : SEP1C17D3C304EC, Protocol : 1|*^*^*
   13381  10:40:38.071 |DebugMsg deviceName : SEP1C17D3C304EC, DeviceType : 434, risClass: 1|*^*^*
   13382  10:40:38.072 |SCCP Device Unregister: deviceName(SEP1C17D3C304EC), Protocol(1), RegisteredSCCP(1441), RegisteredSIP(8), Registered(1449)|*^*^*
   13383  10:40:38.072 |Device Unregister: deviceName(SEP1C17D3C304EC), Protocol(1), RegisteredSCCP(1441), RegisteredSIP(8), Registered(1449), Unregistered(23)|*^*^*
   13387  10:40:38.071 |Closing Station connection DeviceName=SEP1C17D3C304EC, TCPPid = [1.100.9.2931609], IPAddr=10.241.18.33, Port=52664, Device Controller=[1,51,2780341]|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13389  10:40:38.072 |sendPublishOut starts.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13390  10:40:38.072 |sendPublishOut: Removal of Publish status, and the mEtag is empty, Don't Publish|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13391  10:40:38.072 |PublishEPA Deleted|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13394  10:40:38.072 |StationD:    (2549381) DEBUG- star_DSetCallState(14) State of cdpc(245703) is 10.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13395  10:40:38.072 |StationD:    (2549381) DEBUG- star_DSetCallState(14) State of cdpc(245703) is 14.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13396  10:40:38.072 |StationD:    (2780341) DEBUG- star_DSetCallState(14) State of cdpc(245708) is 10.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13397  10:40:38.072 |StationD:    (2780341) DEBUG- star_DSetCallState(14) State of cdpc(245708) is 14.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13398  10:40:38.073 |MatrixControl:updatePartyMediaCoordinatorNodeId: party1 videoCapable=0, party 2 videocapable=0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13399  10:40:38.073 |StationCdpc - INFO: clearType=1, mHoldFlag=0, mOtherSideMediaConnFlag=1.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13400  10:40:38.073 |StationCdpc - INFO: Need to Quit Clear and Media is up.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13401  10:40:38.073 |StationD:    (2549381) DEBUG- star_DSetCallState(14) State of cdpc(245703) is 14.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13402  10:40:38.073 |StationD:    (2549381) SelectSoftKeys instance=3 reference=17216256 softKeySetIndex=8 validKeyMask=ffffffff.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13403  10:40:38.073 |StationD:    (2549381) DisplayPromptStatus timeOut=0 Status='€#' content='Temporary failure' line=3 CI=17216256 ver=85720014.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13404  10:40:38.073 |StationD:    (2549381) StationOutputDisplayText don't need to send, because mIsALegacyDevice = 0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13405  10:40:38.073 |DeviceManager:star_DeviceStop Name=SEP1C17D3C304EC Key=650324ab-7d61-31a1-25f7-f6285d1dd095 RegisterDevice=1922 DualModeFlag=0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13406  10:40:38.073 |SMDMSharedData::findLocalDevice - Name=SEP1C17D3C304EC Key=650324ab-7d61-31a1-25f7-f6285d1dd095 isActvie=1 Pid=(1,51,2780341) found|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13407  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState cepn[650324ab-7d61-31a1-25f7-f6285d1dd095]|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13408  10:40:38.073 |SMDMSharedData::findRemoteDeviceAny - Key=650324ab-7d61-31a1-25f7-f6285d1dd095 not in RemoteDeviceInfo hashmap|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13409  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState not found |1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13410  10:40:38.073 |PendingResetHelper::getCurrTimerId - timerId[0]|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13411  10:40:38.073 |DeviceManager:star_DeviceStop Name=8004123:f1788093-0e82-4330-8594-1d5ab7372a6c Key=811ce0d9-31cd-077a-6450-0e3836737fd9 RegisterDevice=1921 DualModeFlag=0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13412  10:40:38.073 |SMDMSharedData::findLocalDevice - Name=8004123:f1788093-0e82-4330-8594-1d5ab7372a6c Key=811ce0d9-31cd-077a-6450-0e3836737fd9 isActvie=1 Pid=(1,157,8430) found|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13413  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState cepn[811ce0d9-31cd-077a-6450-0e3836737fd9]|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13414  10:40:38.073 |SMDMSharedData::findRemoteDeviceAny - Key=811ce0d9-31cd-077a-6450-0e3836737fd9 not in RemoteDeviceInfo hashmap|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13415  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState not found |1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13416  10:40:38.073 |DeviceManager:star_DeviceStop Name=8004139:f1788093-0e82-4330-8594-1d5ab7372a6c Key=ea9bc65d-d0a6-36a9-9356-efbded1c4d83 RegisterDevice=1921 DualModeFlag=0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13417  10:40:38.073 |SMDMSharedData::findLocalDevice - Name=8004139:f1788093-0e82-4330-8594-1d5ab7372a6c Key=ea9bc65d-d0a6-36a9-9356-efbded1c4d83 isActvie=1 Pid=(1,157,8431) found|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13418  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState cepn[ea9bc65d-d0a6-36a9-9356-efbded1c4d83]|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13419  10:40:38.073 |SMDMSharedData::findRemoteDeviceAny - Key=ea9bc65d-d0a6-36a9-9356-efbded1c4d83 not in RemoteDeviceInfo hashmap|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   13420  10:40:38.073 |SMDMSharedData::notifyUnRegisterAndDeleteSubscribeeState not found |1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15083  10:40:39.581 |StationD:    (2780341) DEBUG- star_DSetCallPhase updateACall=17216257 from Phase=1 to  callPhase=3.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15084  10:40:39.581 |StationD:    (2780341) DEBUG- star_DSetCallState(15) State of cdpc(245708) is 14.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15085  10:40:39.581 |StationD:    (2780341) ConnectionStatisticsReq directoryNum=8004123 callIdentifier=17216257 statsProcessingMode=0(clearStats), totalDiagReqs=1|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15086  10:40:39.581 |StationD:    (2780341) SetLamp mode=1, stim=9 stimInst=1.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15087  10:40:39.581 |StationD:    (2780341) ClearPromptStatus lineInstance=1 callReference=17216257.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15088  10:40:39.581 |StationD:    (2780341) CloseReceiveChannel conferenceID=17216257 passThruPartyID=16931666.  myIP: IpAddr.type:0 ipv4Addr:0x0af11221(10.241.18.33) |1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15089  10:40:39.581 |StationD:    (2780341) StopMediaTransmission conferenceID=17216257 passThruPartyID=16931666.  myIP: IpAddr.type:0 ipv4Addr:0x0af11221(10.241.18.33) |1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15090  10:40:39.581 |StationD:    (2780341) CallState callState=2 lineInstance=1 callReference=17216257 privacy=0 sccp_precedenceLv=4 precedenceDm=0|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15091  10:40:39.581 |StationD:    (2780341) SelectSoftKeys instance=0 reference=0 softKeySetIndex=0 validKeyMask=ffffffff.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15092  10:40:39.581 |StationD:    (2780341) DefineTimeDate timeDateInfo=4/4/2011 10:40:39,1 systemTime=1301931639.|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15093  10:40:39.581 |StationD:    (2780341) SetSpeakerMode speakermode=2(Off).|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
   15094  10:40:39.582 |StationD:    (2780341) SetRinger ringMode=1(RingOff).|1,100,50,1.111382969^10.241.18.33^SEP1C17D3C304EC
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

From the system error message guide from CUCM 8.5(1) on reason code 6:
"ConnectivityError - The network connection between the device and Unified CM dropped before the device was fully registered. Possible causes include device power outage, network power outage, network configuration error, network delay, packet drops and packet corruption. It is also possible to get this error if the Unified CM node is experiencing high CPU usage. Verify that the device is powered up and operating, verify that there is network connectivity between the device and Unified CM, and verify the CPU utilization is in the safe range (you can monitor this via the CPU Pegging Alert in RTMT)."

So unless one of your server CPUs is pegged (use RTMT to monitor it as suggested above), the issue points to a power or network fault.
Have you tried rebooting each server starting with the pub?  
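
For anyone who wants to tally these unregister events across a larger trace rather than eyeballing them, here is a minimal sketch in Python. It assumes the alarm text looks like the `EndPointUnregistered` line in the excerpt posted above; the function names and the abbreviated sample lines are made up for illustration, and the pattern may need adjusting for other CUCM versions or trace formats.

```python
import re
from collections import Counter

# Pattern modelled on the EndPointUnregistered line in the trace excerpt above.
# The field layout is an assumption based on that single sample.
UNREG_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \|EndPointUnregistered\b.*?"
    r"Device name:(?P<device>\S+) "
    r"Device IP address:(?P<ip>\S+) "
    r".*?Reason Code:(?P<reason>\d+)"
)


def find_unregisters(lines):
    """Return one dict per EndPointUnregistered line found in `lines`."""
    events = []
    for line in lines:
        match = UNREG_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def count_by_reason(events):
    """Tally events by reason code (kept as strings, as they appear in the log)."""
    return Counter(event["reason"] for event in events)


# Abbreviated sample lines in the same shape as the excerpt (description shortened).
sample = [
    "13378  10:40:38.071 |EndPointUnregistered - An endpoint has unregistered "
    "Device name:SEP1C17D3C304EC Device IP address:10.241.18.33 Protocol:SCCP "
    "Device type:434 Device description:Test Phone - 8004123 Reason Code:6 "
    "IPAddressAttributes:3",
    "13380  10:40:38.071 |Device UnRegister deviceName : SEP1C17D3C304EC, Protocol : 1|*^*^*",
]

events = find_unregisters(sample)
for event in events:
    print(event["time"], event["device"], event["ip"], "reason", event["reason"])
print(count_by_reason(events))
```

Grouping the results by IP subnet or by time of day would be a natural next step if you suspect particular closets or busy periods.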

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Rebooted the Pub (1st) and both Subs last night.
No improvement.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Cisco TAC (Routing & switching) yesterday determined, to their satisfaction, that we do NOT have a network problem.

Voice problems continue. Intermittent but frequent UCM Down screen messages, frequent cases of 1-way audio occurring after the call is already established. Occasional sets re-registering.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Increasing in intensity/frequency of occurrence.  What began as a single complaint a week ago has steadily progressed into more and more users reporting it each successive day. Today the problem is prolific, users are practically on their knees begging for relief.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

What does RTMT show for CPU usage on the servers? Any delay in getting dialtone, etc.?

CUCM traces are showing that the phone unregisters...that happens because of communication issues between phone and CUCM server.

Sniffer on the LAN? See if something is running amok .. virus, etc.? I've seen issues occur when an SNMP 'walk' occurred intermittently on a network. Takes all the available bandwidth and starves all devices.

Sure sounds like something is bugging your LAN...
 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Cisco TAC (one gentleman in particular) has been especially helpful. We've sniffed the lan, again with TAC on a WebEx session with us. The network is clean & uncongested. It's *not* a network issue.

No delayed dialtone.

Now leaning toward the phone load. Reflashed all the phones last night back to 9.0.2 (from 9.1.1) - Waiting for the sun to come up now.

 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Unfortunately no improvement.
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

This is a good one, mitel. Keep us posted. Hopefully you got a decent TAC engineer. Have you escalated your case to a P2 yet? I highly suggest you do.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Back & forth, now TAC is again leaning toward the network. Opened a new case, this time w/Backbone, looking at the core (two 6500's w/ a couple 10g blades) they noted sporadic high CPU activity at one point hitting 96% and several hits in the 80 percentile over the past 72 hours. Something's definitely afoot, the core normally averages 15% CPU utilization occasional peaks never above 25%. Took a mtce window late last night, made sign of the cross & rebooted the core. Have to wait now for Monday's call volume to pick up to see if this made any improvement. (problem not seen when light traffic volume)

The 1-way audio is both internal (peer-to-peer) as well as external, set to PRI. Initially the call starts out fine, several minutes into the call the receive audio begins to garble and is subsequently lost for up to 15~20 seconds... 1 instrument (the one with loss of receive audio) also displays "UCM Down Features Disabled" indicating it has lost a path common to its peer and to the CUCM (keep-alives lost). Wireshark traces have caught several of the events, confirming packet loss. Thought about spanning tree reconverging, but that's not it.

Interestingly sets connected via either of 2 wiring closets on the 3rd floor (physically nearest the CUCM) have never lost receive audio nor had UCM Down error, but have been in conversations w/peers on other floors/wiring closets where the reverse is true, always experiencing the failure inbound to them. (different networks, different voice Vlans). Not sure what this is telling us.

We (phone techs, network techs and 3 onsite SE's and TAC) are now fairly well convinced it's an obscure network issue, but my management is still leaning toward it being a CUCM Load issue & lobbying us to "go back" to 8.0 because their feet are in the fire (and the problem of course has manifested itself on the phone system, where it logically would be seen first). We're presently dropping or experiencing intermittent 1-way audio on at least 1/4 to 1/3 of all call traffic (at 2500 calls per hour peak) on 1400+ lines. Users are screaming. Senior mgmt screaming. ***FIX THE F'ING PHONE SYSTEM*** Only problem with "going back" is it's now been 25 days since 8.5 loaded and we have a very dynamic database, lots of daily MACs & just finished a huge call tree 2 weeks ago with several CTI RP, translations and a couple hunt groups. "Going back" from a reprogramming perspective would be disastrous, and easily take 3 days at the keyboard to rebuild all that's changed.
 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
All TAC cases this week are P2

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Sunday 4/10:

I now have 4 test phones strategically located around the building in wiring closets where the 1-way audio, etc has been most prolific. Calls between each other are nailed up & there are 4 laptops w/wireshark running to hopefully capture something.

The traces previously captured definitely show intermittent packet loss, as tho QOS wasn't enabled, tho it definitely is. Packets aren't delayed or being resent, they're just plain gone. Wireshark is seeing it, but wtf's causing it???
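
A rough way to put a number on "just plain gone" is to look for gaps in the RTP sequence numbers of one stream. The sketch below is only an illustration: it assumes the sequence numbers of a single stream have already been exported from the capture, in arrival order, into a plain list, and the function name is made up here. RTP sequence numbers are 16 bits wide, so the count has to allow for wraparound.

```python
def count_lost_packets(seq_numbers):
    """Count packets missing from an ordered list of 16-bit RTP sequence numbers.

    Assumes packets were captured in order; reordered packets would be
    miscounted as loss, and duplicates (step of 0) are ignored.
    """
    lost = 0
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        # Sequence numbers wrap from 65535 back to 0, so take the step modulo 2**16.
        step = (cur - prev) % 65536
        if step > 1:
            lost += step - 1
    return lost


# Made-up example: 65535 is missing across the wrap, and 2, 3, 4 are missing later.
observed = [65533, 65534, 0, 1, 5, 6]
print(count_lost_packets(observed))  # prints 4
```

Comparing the count for the same call at both ends (sender-side and receiver-side captures) helps show whether the packets left the phone at all or disappeared somewhere in between.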

Solar Winds (Engr Tool Set 9.x) w/network monitoring enabled on literally every up & downstream backbone fiber interface as well as on the core is not showing any congestion or latency anywhere on the Lan. Nothing above 30% utilization and most interfaces coasting at 10~15%. Is SW telling us the truth?
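
One general caveat with any poll-based utilization graph, sketched below: a figure averaged over a polling window can look quiet even when the link or CPU was briefly saturated inside that window. The numbers are entirely made up (a 5-minute window sampled once per second with a 10-second burst), and nothing in this thread confirms how the monitoring here was actually configured, so treat it as an illustration of the arithmetic only.

```python
def average_utilization(samples):
    """Mean of a list of utilization percentages."""
    return sum(samples) / len(samples)


# Hypothetical per-second utilization (%) over one 5-minute polling window:
# a steady 10% with a single 10-second burst at 100%.
window = [10.0] * 290 + [100.0] * 10

print(average_utilization(window))  # prints 13.0
print(max(window))                  # prints 100.0
```

In this made-up case the polled graph would show about 13% while the link was pegged for ten seconds, which is plenty of time to lose voice packets and keepalives.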

The IOS on the core is unarguably a museum piece (12.2.18.SX.D9), circa 2K4, & desperately needs upgrading, but has been stable for so many years that an upgrade will require special dispensation from many gods for us to push a new one to them.

Everyone is looking at each other and asking "what changed?" because CUCM had been running without issue since it was turned up last July. Even the recent upgrade to CUCM 8.5(1) ran flawlessly from 3/15 until 3/29 when the first report of trouble was called in. Since 3/30 our lives have been a living hell. Blank check for overtime, but not the kind of OT anyone wants. We're all about to have a stroke or heart attack. This office ran nearly 24 consecutive years on the old Mitel SX2000 without more than an hour total unscheduled downtime.  9 months into Cisco & our feet have been in the fire for all of the last 2 weeks & the decision-makers that promoted this thing are looking to duck & cover. Rebooting the core Sat'dy didn't fix it.

 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Is 12.2.18.SX.D9 CAT-OS? I would bet you money that it is not on the compatibility matrix with CUCM 8.5, as it is too old, and therefore not supported.
I don't know a thing about the Solar Winds network monitoring tool, but it obviously cannot be trusted.
In your words:
" looking at the core (two 6500's w/ a couple 10g blades) they noted sporadic high CPU activity at one point hitting 96% and several hits in the 80 percentile over the past 72 hours."

Obviously Solar Winds did not register that either, so stop relying on it for this issue.

Bottom line: the issue is not with CUCM but with a faulty network. You are in a converged network now and everything affects everything.
The network is dropping packets and that is a fact. So until that is fixed the phone system will suffer. You will need to find out where those packets are being dropped and eliminate/fix that device. I wouldn't be surprised if it is your IOS on the core.
Upgrade the 6500's and see what happens. I can't believe TAC has not asked you to do this yet. Something is failing on it, whether it is IOS, a blade, or the backplane. Hard to say.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Yes CAT OS

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
FOLLOW-UP

The fire is out, but its origin seems strange. A year or so ago we had forced the ARP timers on all of our LAN switches (3560/48) to 300 seconds (5 minutes).

Rather than take a shotgun approach & then never know what fixed it, we applied TAC's recommended changes en masse, then backed the changes out one by one over the following week until we started getting complaints again.

The ARP timer turned out to be the trigger.  Cisco's default value for this is 4 hours.

According to our network engineer we had set this to 5 minutes to more or less accommodate the frequent comings and goings of various servers being changed out, but needing to reuse the old address for the new hardware. In this way the server team could go about their daily routine without having to pester the network team to flush the ARP cache every time a server or other piece of hardware was replaced. Okay, in retrospect maybe this wasn't such a good idea.
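
To give a feel for the difference between the two settings, here is a back-of-the-envelope calculation in Python. It rests on a deliberately simple assumption that is not from TAC or from the switch documentation: that every cached entry gets re-resolved once per timeout interval. The figure of 1400 entries is just the phone count from this thread; a real ARP table would also hold servers, PCs and so on, and the function name is made up for the example.

```python
DEFAULT_TIMEOUT_S = 4 * 60 * 60  # Cisco default mentioned above: 4 hours
TUNED_TIMEOUT_S = 300            # the 5-minute value we had configured


def refreshes_per_hour(entries, timeout_s):
    """Rough ARP refresh count per hour, assuming one refresh per entry per timeout."""
    return entries * 3600 / timeout_s


entries = 1400  # assumption: one cache entry per phone, nothing else counted

tuned = refreshes_per_hour(entries, TUNED_TIMEOUT_S)
default = refreshes_per_hour(entries, DEFAULT_TIMEOUT_S)

print(tuned)            # prints 16800.0
print(default)          # prints 350.0
print(tuned / default)  # prints 48.0
```

Under that assumption the 5-minute setting generates 48 times as much ARP refresh activity as the default, whatever the actual number of entries is, since the ratio depends only on the two timeouts.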

What puzzles me is why we were able to run 1400+ VOIP phones in this environment for 11 months without issue and only shortly after upgrading CUCM to 8.5(1) from 8.0(2) did the problem begin showing up?  IOW, what's changed in 8.5 making it so susceptible to ARP activity?

 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Assuming this is a rhetorical question here is my thesis:
lowering that timeout would most likely increase CPU load and if the CPU load increased enough would potentially affect the stability of the 6500 and thereby the stability of any network which depends on that device.
You mentioned that the 6500 were running at very high CPU during those incidents so there is your possible cause.

Why now and not earlier? Who knows? Maybe it coincided with the upgrade but unrelated to it (I'd bet beers on this).

By the way I've seen ARP issues cause total havoc on networks and that's what you experienced first hand.

I'm glad it's fixed though.  
 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks.

By the way only as a note in passing, all 1400+ phones are registered to the PUB not to the sub. This has been the configuration from day-one.  Our VAR did this to us..... We have a SUB but it is not local, instead located offsite, 150 miles away in a Data Foundry hot site.  It's online, but nothing was ever registered to the (remote) SUB and still isn't.

We later learned (not from our VAR) that this was most likely an unsupported configuration and on the advice of 2 CCIE's we ordered another Sub to install here in the same rack as the PUB (where the sub logically belongs)

We did that, changed the fail-over order and actually forced-over the 1400+ local instruments to the *new* local SUB as one of the steps in upgrading the system from 8.0 to 8.5  However, the fit hit the shan 2 weeks later w/one-way audio and UCM down, etc. and so as part of a shotgun approach we forced everyone back to the PUB and physically shut down the new Sub. This is how it remains today, still offline, cold in the rack, with everything running on the PUB and the only running Sub still 150 miles distant.

We took such a whipping & these intermittent one-way audio drops were so high profile and so very disruptive to our business that our management is understandably gunshy about re-introducing that new (local) Sub back into the mix.  By contrast I think we're living on borrowed time with everything registered to the PUB. I'd sleep better with that new Sub up and running and the phones registered to it as they should be, but I'm unsure how to do it myself (I'm barely qualified to do daily MACS) and feel like I really can't trust the VAR. I feel like my VAR is TFW, totally (expletive deleted) worthless.

Thoughts?

 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Hi Mitel.. We also recently upgraded to 8.5 from 6.1 and are also getting the exact issues you are encountering: phones resetting on their own with or without a call, one-way audio, call history issues, devices going dummy up after a reset. Nothing prior to 8.5. We have 7 clusters with over 80 subscribers and close to 120k devices (yes, 120k). Like you, everything has the look and feel as if the network is the issue. But we've looked and it seems to be running as it should. I will check the issue you ran into with the ARP table timers. It feels like the calm before the storm, but it's not a normal it-will-go-away storm, it's like the "Perfect Storm". Who the f$%#$% does Cisco have working for them? Where did they get this piece of s#$^$? Cisco used to stand for stability, but not now. Mitel, let me know if you find anything else with this, will you plz. Thx for the details and info.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

is the issue on all the phones and all the clusters?
Your description is very generic unless you are just here to vent.
Did your phones upgrade firmware during the upgrade?
Nothing else happened on your network that would be the same time as this? No other network upgrades or changes?

 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Whykap, I am assuming you're talking to me and not Mitel. If so, so far it's just happening to two of our largest clusters. Have not heard of anything from the other clusters, but that's not to say the users aren't just putting up with it.

Our phones' firmware was pre-upgraded from 8.5 to the current 9.1 prior to any of the CUCM upgrades.

No other network work/outages reported by our Network group.  The smaller clusters seem to run fine with no issues, but the larger ones seem to come up with weird things/symptoms that has never happened before, prior to 8.5 upgrade..

Symptoms;  Phone resets on its own, 1 way audio, Call history not consistent, Ringlist and our Company directories show "Host not found" error. These symptoms are very sporadic and maybe happening to more end users, but maybe not reporting because of the recent activities.

The larger clusters have a centralized TFTP design (meaning 2 TFTP servers in the DHCP option 150 scope, and those have the alternates defined). This design is shared between 3 clusters, the first 2 running CUCM 8.5(1) and the last running 6.1(4). It's very hectic here since we have been doing the upgrades these past 6 months (7 cluster upgrades in 6 months).

Yes, multiple TAC cases opened and the same response: give us the logs. We did, and just like Mitel above, they see a network disconnect. I see some major alterations in CUCM 8.5 which I am not happy with. The key is that this started happening ONLY after the CUCM 8.5 upgrade.



 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Are any phones on the 6.X cluster experiencing similar issues?  

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
CISCO4ME:

Our "fire" is out and the incorrect setting of the ARP timers was determined to be the cause, as we felt like we had eliminated everything else. We believe that some obscure coding change in release 8.5 may have triggered it, but that's only a guess. Our severely reduced staff size and day-to-day workload do not allow us the luxury of spending time to pinpoint the actual root cause.

Our management was adamant this had to be a "phone system" problem and not a network problem, as everyone in mgmt. was still thinking from the old TDM point of view, where all calls passed through the PBX switch fabric for the duration of the call. In their mind 1-way audio and calls dropping had to be the PBX's fault, especially since we had just upgraded from 8.0 to 8.5. It took a lot (plural days) of whiteboarding and bringing in several CCIEs and opening P1 TAC cases, including Routing & Switching & Backbone to get the decision makers to finally understand how VOIP systems work.

Once they (mgmt) finally accepted this they were still pointing fingers at the possibility of the new set firmware being the cause, so to disprove this they had us downgrade all the instruments to the previous firmware load. This went poorly by the way, as it caused every user to lose all of their phone customizations (custom ring settings, background images, etc.) and still did not solve "the problem".

We wound up with laptops stuck in wiring closets all over the building running WireShark. Traces seemed to point to network congestion even though our network traffic never exceeded 40%.

With each passing day the fires were becoming hotter and of course the brand new phone system was directly in the crosshairs and being perceived as a complete POS. It no longer mattered whether it was a "network issue" or not. The end users saw it for exactly what it was; after more than 20 years with a "stable as Jesus" phone system, suddenly their NEW PHONES no longer worked. The month of April 2011 won't soon be forgotten around our place.

Through most of this crisis we had been trying to methodically troubleshoot & try different approaches, even putting our execs' phones on a separate, dedicated, non-routed network straight to the CUCM. Of course that put out "their" fires, which had the additional benefit of giving credence to the original diagnosis, that this was never a phone problem, it was always a network problem.

I do not know when or who came up with the idea to put the ARP timers back at their factory default settings. It was part of a "shotgun approach" that seemed to finally "fix it" so for several days no one knew exactly what the silver bullet was until we (1 step and 1 day at a time) began undoing the shotgun attack. Imagine everyone's surprise when we finally got down to the ARP timers. Within 1 hour of putting them back to the 300 seconds (5 minute) setting they had previously been set at (for a couple years without issue or incident) all of a sudden the 1-way audio and UCM down problems were back with a holy vengeance.

OK, ARP timers. - but now we've got a phone system that still to this day remains torn asunder, with execs on dedicated fibers running back to the phone room and the 6500 core split (no longer redundant) and everyone walking on eggs. We're satisfied that the ARP timer setting was the most likely cause, but we're all still licking our wounds from last April and gunshy of even touching what's working. We're even barred from making any further system OS upgrades, even in the face of known vulnerability CERTS. You walk past the CORE and feel like you need to make the sign of the cross.....  Breaking a heretofore bulletproof (and business vital) phone system in the corporate headquarters of a Fortune-50 energy corporation has that effect on you.

Bear in mind that you have only a very limited window of opportunity "to go back" to your prior load and with each passing day the amount of rework that will go along with a rollback grows exponentially. Every MAC you've done since the upgrade will likely have to be redone. Unless you're archiving your daily backups, you'll no longer have a viable 6.1 database to go back to in a matter of (I think) 10 days - could be less, I forget.

Good luck.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Can you maybe share the tac case number here, or the other suggestions tac made besides the arp change that corrected the problem?

We are having a very similar problem with CUCM 8.5.1 and have not yet found the cause. TAC says it is not CUCM; we get the TCP keepalive socket error.

Thanks

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Mitel & all,

You made me so scared with this adventure that I want to run away from my present job, as we are planning to migrate to CUCM from an Aastra (formerly Ericsson) MD110 system in the coming weeks. I burnt my fingers a few years ago while implementing VoIP and trying to convince my users that the problems were network issues that had to be sorted out with a different approach. It has its advantages, but when you have issues, you need 100 wise guys to bring you out of the sh.. This thread has made my belief stronger. Also, the fact that I have limited networking skills & serious doubts about the design of our current network is already causing sleepless nights. The current topology seems to be far behind what is needed to handle this convergence of voice, data & video.

Anyhow guys, wish you all good luck & do pray for my adventure. In fact I will very much rely upon your experience & knowledge in the coming weeks..

Cheers,

""The truth about action must be known and the truth of inaction also must be known; even so the truth about prohibited action must be known. For mysterious are the ways of action""
 

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
The dust having long since settled and having had considerable time to go back and revisit all the various things we tried, we are today 100% certain that the problem was (is) network related and putting the ARP timers back to factory default has merely masked a larger network design issue that most likely is still lurking out there and will one day bite us again. At Cisco's recommendation ($$ naturally) we today are in the midst of replacing our 6500 (redundant core) with a pair of new 7000's. We will also be conducting a network rearchitecture starting sometime in the 1st quarter of 2K12.

We are also (today, finally) 100% certain that CUCM (and the 8.5.1 load) had nothing to do with the problem beyond anecdotal coincidence (and finger-pointing). The problem did not appear until a week or more after 8.5.1 was introduced. Granted that the new load was introduced around the Easter time period, over which a number of people were out of the office on holiday leave and network activity was therefore low.

CUCM did not (ever) go down. RTMT was not reporting any unusual call volume.  After deploying a half dozen packet sniffers we occasionally began to see some of the dropped packets, but with blowtorches blazing up our backsides (in the corporate office of a Fortune-500 company) I have to admit some of our work was shotgunning rather than based on any carefully-detailed scientific forensics. (When you're up to your neck in alligators, it can be awfully hard to focus on draining the swamp).  Things were especially high-profile as we had only recently replaced a 20+ year rock-stable/dependable (Mitel) system with what the customer suddenly perceived as junk. It did not matter that the data network was ultimately at fault - what the customer saw was that their phones were no longer dependable. An absolute public relations nightmare.

To anyone going through this the only advice I have is to remain focused. VOIP by its very nature is an extremely delicate communications medium. If a gnat farts anywhere in your network, your VOIP phones will be first to smell it. Stay focused... it's a network problem.

To anyone planning a VOIP phone system rollout, please for goodness' sake, bite the bullet and pay the $$ to have your network architecture professionally assessed and follow the recommended course of action. At least that way you'll have some recourse if the fit hits the shan.


 

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Dear All,

Kindly note that we are facing a similar problem and have (hopefully) identified the root cause. Our CUCM 8.5.x servers are behind a 65xx with an FWSM module. Whenever we add/delete/change any FWSM rule and hit SAVE/APPLY, about 40% of the phones reset at random. We will keep you posted once we get a resolution from Cisco, as the TAC case is still open.

I don't understand why hundreds of other applications are working fine (thank God) and only the IP phones are giving problems.

For now I can only suggest that you schedule a maintenance window in your environment, add/delete/change a rule on the FWSM (as we did), apply the change, and at the same time monitor whether the phones reset.

Hope this helps.

Regards,

Mohammad Ali

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

This is for everyone, and I hope it helps. Our problem has been found. It caused a great deal of grief for me and my whole department.

Since I received the notification of the last post by Mohammad I thought it necessary to let you know what we found.

Our environment has the exact same CUCM 8.5, behind 65xx with FWSM.

Our problem was that our CUCM is in the data center, and several of our server staff were running application backups over the production network.

Our actual firewall/FWSM interfaces were getting congested to over 100%, causing very erratic problems with the IP phones and remote voice gateways registered to the CUCM subscriber behind the firewall.

We removed all the backup traffic and are replacing the FWSM with ASA devices.
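
For anyone who wants to check whether an interface is being pushed this hard, here is a minimal Python sketch of the usual utilization arithmetic from two octet-counter samples. The function name, the counter values, and the 1 Gbps link speed are made up for illustration; how you collect the counters (SNMP, CLI scraping, etc.) is left out on purpose.

```python
def link_utilization_percent(octets_start: int, octets_end: int,
                             interval_seconds: float, link_bps: float,
                             counter_bits: int = 64) -> float:
    """Average utilization between two octet-counter samples, in percent.

    Assumes the counter wrapped at most once between the samples.
    """
    if interval_seconds <= 0 or link_bps <= 0:
        raise ValueError("interval_seconds and link_bps must be positive")
    # Modulo arithmetic absorbs a single counter wrap.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 / (interval_seconds * link_bps) * 100


if __name__ == "__main__":
    # Hypothetical samples taken 60 seconds apart on a 1 Gbps interface.
    pct = link_utilization_percent(0, 6_750_000_000, 60, 1_000_000_000)
    print(f"average utilization: {pct:.1f}%")
```

Keep in mind this is an average over the sampling interval, so short bursts that starve voice traffic can hide behind a modest-looking number. Shorter intervals give a more honest picture.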

This problem was a killer.


John

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Something interesting to note on page 61 of the CUCM SRND

"The recommendation to limit the number of devices in a single Unified Communications VLAN to
approximately 512 is not solely due to the need to control the amount of VLAN broadcast traffic. For
Linux-based Unified CM server platforms, the ARP cache has a hard limit of 1024 devices. Installing
Unified CM in a VLAN with an IP subnet containing more than 1024 devices can cause the Unified CM
server ARP cache to fill up quickly, which can seriously affect communications between the Unified CM
server and other Unified Communications endpoints. Even though the ARP cache size on
Windows-based Unified CM server platforms expands dynamically, Cisco strongly recommends a limit
of 512 devices in any VLAN regardless of the operating system used by the Unified CM server platform."

We ran into a similar issue and had to segment the broadcast domains. I suspect that the OP's issue was resolved by changing the ARP timers because the CUCM was learning too many.
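
As a rough way to sanity-check a voice VLAN against the two figures in that SRND passage (512 recommended devices, 1024-entry ARP cache), something like the Python sketch below could be used. The function name, the /21 subnet, and the assumption that all 1400 phones share one VLAN are illustrative only; the OP never said how the phones were laid out.

```python
import ipaddress

# Figures taken from the SRND passage quoted above.
ARP_CACHE_HARD_LIMIT = 1024   # Linux-based Unified CM ARP cache limit
RECOMMENDED_VLAN_MAX = 512    # recommended devices per voice VLAN


def check_voice_subnet(cidr: str, device_count: int) -> dict:
    """Compare a subnet and its device count against the SRND figures."""
    net = ipaddress.ip_network(cidr, strict=False)
    # Usable hosts: exclude the network and broadcast addresses.
    usable = net.num_addresses - 2 if net.prefixlen < 31 else net.num_addresses
    return {
        "usable_hosts": usable,
        "subnet_can_exceed_arp_limit": usable > ARP_CACHE_HARD_LIMIT,
        "devices_over_recommendation": device_count > RECOMMENDED_VLAN_MAX,
        "devices_over_arp_limit": device_count > ARP_CACHE_HARD_LIMIT,
    }


if __name__ == "__main__":
    # Hypothetical layout: 1400 phones in a single /21.
    print(check_voice_subnet("10.10.0.0/21", 1400))
```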

Hope this helps someone.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks Agent6376

What puzzles me is why we were able to run 1400+ VOIP phones in this "toxic" ARP environment for 11 months without issue and only shortly after upgrading CUCM to 8.5(1) from 8.0(2) did the problem begin showing up? IOW, what changed in 8.5 making it so susceptible to ARP activity? Something had to have changed.

We recently moved on to 8.6.2(a) and believe me, with last year's experiences still fresh in our minds, everyone's sphincters were tightly puckered when we switched over. Fortunately no major issues and we've been on 8.6.2(a) for a couple months now.

As you can probably imagine, our shop was a really scary place to be a year ago last April when all the trouble hit. Fortunately cooler heads prevailed, but there was a point last year that our Cisco Acct. exec. was afraid to show his face for fear that our Sr. execs were going to tell him to yank it out.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Agent6376:
" I suspect that the OP's issue was resolved by changing the ARP timers because the CUCM was learning too many."

But the OP said they changed it from the non-default 5-minutes to the Cisco default 4 hours, which fixed the problem.

So learning *too many* can't have been the problem.

Seeing as the problem coincided with a CCM upgrade, I'd be interested in looking into how the CCM has changed the way it discovers handsets and how it reacts to a handset becoming unknown.
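
To put rough numbers on the two timer values mentioned above, here is a back-of-the-envelope Python sketch. It assumes, purely for illustration, that every one of the 1400 phones has one ARP entry that is re-resolved once per timeout period; real behaviour depends on the platform and on traffic, so treat the output as an order-of-magnitude comparison only.

```python
def arp_refreshes_per_second(entries: int, timeout_seconds: int) -> float:
    """Average ARP re-resolutions per second if each entry expires once per timeout."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return entries / timeout_seconds


PHONES = 1400                 # phone count from the original post
FIVE_MINUTES = 5 * 60         # the non-default timer, in seconds
FOUR_HOURS = 4 * 60 * 60      # the Cisco default timer, in seconds

short_rate = arp_refreshes_per_second(PHONES, FIVE_MINUTES)
default_rate = arp_refreshes_per_second(PHONES, FOUR_HOURS)

print(f"5-minute timeout: {short_rate:.2f} refreshes/s")
print(f"4-hour timeout:   {default_rate:.3f} refreshes/s")
print(f"ratio:            {short_rate / default_rate:.0f}x")
```

Under that assumption the short timer produces dozens of times more ARP churn for the same number of devices, which fits the point that the number of entries did not change, only how often they had to be relearned.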

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

Just thinking about this - the calls were getting interrupted, right?

- handset to handset, ethernet path, all switches are seeing a continual flow of packets between them, so no ARP entries are going to time out relating to that traffic flow.
- The Wireshark traces showed "intermittent packet loss" - would be good to know what this actually means and whether it can be related to the call's QoS stats
- a bit of packet loss doesn't end a call/cause a tear-down to be sent.
- the CUCM log shows the Call Manager unregistering the phone
- this isn't a call being torn-down, it's a handset, active in a conversation, being unregistered from the system.

So you "fixed" this by fiddling with ARP timeouts.
Ethernet works just fine, regardless of ARP timeout value, so it could be an issue of configuration on your network in relation to frame flooding, or an application problem within the CCM.

I haven't played with CCM for a long time, so this question is very vague, but what are your timeout settings within CCM for handset registrations?
Could it be caused by a weird combination of low handset registration timeout, lower ARP timeout, and some kind of flood control config?

Having said that, running a new VoIP system on a network run on an ancient IOS is a bad idea, as is bodging your entire LAN switch configuration just to allow the server guys to perform unplanned cowboy work.

In this situation, I would be trying to convince Management that an 8-year-old network core is due for a refresh, and get them to fork out for a nice new VSS pair of 6500s, PLUS, a gentle re-architecture of the network to provide a proper distribution layer, especially, get all the server connections out of the network core and onto Server Data Centre switches.

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

(OP)
Thanks Vince.

We recently purchased a pair of 7000's and brought in some hired guns from some (Gartner Group recommended) 3rd party outfit (not Cisco) to do a full network re-architecture for us (long, long overdue). I believe this path was chosen so as to safely distance us from Cisco's zeal to up-sell us $50MM in new network hardware and influence the decision of what truly needs to be done. It may well cost us $50MM in the end, but with recommendations coming from networking pros with no skin in the game it's a lot easier to believe what they tell us vs listening to the VAR.

Re the comment about the ancient IOS, again you're preaching to the choir. The "data" network just ran and ran for years on end. Then when we wanted to migrate the phones to VOIP someone blurted out some statement like "COS is all you need" and suddenly the bus was loading to take us all to Abilene (Abilene Paradox - how we all got to Abilene when no one wanted to go), AKA "how to turn peanut oil into jet fuel"

Whatever the actual root cause of the problem, resetting the ARP timers back to default values either solved it or sufficiently masked the issue, for now anyway.

Original MUG/NAMU Charter Member

RE: UCM DOWN, 1-way audio, dropped calls, audio fade in and out

I'm seeing a similar issue at just one of my 10 MPLS sites. It's a small office with just 10 phones, connected to HQ via MPLS over a T1 circuit. The problem is so sporadic but frustrating.

Where exactly do you change the ARP timers? My switches are 2960s, and I didn't see a similar command except for mac-address-table aging-time.
