Ticket #3611 (closed Bugs: fixed)

Opened 4 years ago

Last modified 4 years ago

Orphaned Parked Calls locks FreePBX

Reported by: foneman Assigned to: xrobau
Priority: crash Milestone: 2.6
Component: Parking Lot Version: 2.5-branch
Keywords: Orphaned Call Cc:
Confirmation: Unreviewed Distro:
Backend Engine: Asterisk 1.4.x Distro Ver:
Backend Ver: SVN Revision (if applicable): 7967

Description

I have a customer that has a duplicatable problem. When a call is parked, the orphaned call is set to go to a ring group. The ring group does not answer but an extension rings. Once the call is finished, it seems that the system freezes. I am going to the lab to duplicate it myself and will post the outcome. This happened on another system of mine but we thought it was something else and never finalized the solution. I think this could be a parked call bug?

Change History

04/01/09 13:50:43 changed by foneman

UPDATED SCENARIO: So here it is in the order that it appears to error out - 1 - Call comes in and rings to our master ring group (600) 2 - Someone picks up the call and places it in the parking lot. 3 - After two minutes (timeout for my parking lot) call rings back to the extension that placed it in there to begin with. Not sure why this happens this way as our Orphaned Parked Calls are supposed to ring back to the master ring group (600) 4 - Same extension picks call back up as it rings in. 5 - After that call is hung up is when the error starts and just spikes the CPU to over 99%. I can duplicate this now so until I figure out why or what I can do to solve it I have increased our parked timeout to 4 minutes.

04/01/09 15:22:58 changed by francesco_r

I had similar problems with my customers that i have trained to use the park feature.
I always used to loop the ring groups (destination = the same ring group) with FreePBX 2.4 so an external call ring until the caller hangup.
With 2.4 the timeout returned normally to the extension that parked the call but orphaned call was lost (call hangup). Now that i have upgraded to 2.5 the orphaned loop the call (CPU 100%) also if the external caller hangup.
In these situations i'm forced to login in the asterisk console and "soft hangup" the local channel involved in the loop. I solved for the moment putting the hangup ad destination of the ring group after 60 seconds, intead of the ring group itself.

05/08/09 17:07:10 changed by richnyla

I have this problem as well. I have one location where they use the park feature quite frequently. I finally went through one of the 10gig log files to see what was going on and found it. If it goes unnoticed long enough you can get a monster sized log file. I'm going to do some more testing this weekend and see if I can find any more details regarding this problem.

05/08/09 17:19:42 changed by foneman

BUMP! Any insight on this?

05/14/09 16:40:27 changed by richnyla

I think I have located the problem here. It is beeing caused by a loop in the ring groups. I have about 7 of these systems runninging in the company and I was able to recreate the problem on all of them, they are all on different versions of Asterisk, but all running the same version of FreePBX. Anyway. I had ring groups that rolled over to another ring group if it did not get answered. I also had the 2nd ring group setup to go back to the 1st ring group if it was not answered. In this configuration, the double park sends the system into the endless loop.

To fix for now I changed my ring groups so it would not loop. I created a couple extra ring groups and change my plan so that when ring group 1 times out, the call goes to ring group 2, then it will go to a ring all group, then terminate the call. Of course I did the same with group 2 to group 1 to ring all. It takes 5 ring groups to do this, but it works.

06/05/09 18:06:02 changed by xrobau

  • owner changed from p_lindheimer to xrobau.
  • status changed from new to assigned.

Can you possibly copy and paste the actual dialplan commands that happen IN the loop? That'll make it a lot easier to diagnose.

eg, do:

asterisk -r (this connects to the asterisk console) then cause the loop then copy and paste the loop that's occuring (be generous at the start)

Thanks!

06/05/09 18:35:40 changed by richnyla

I'll have to recreate the problem, it has been so long I dont have the log files, they were huge logs (20GB+) I nad to import into a sql database so I could browse and see the cause of the problem.

My systems were setup with 2 ring groups, call them A and B. if ring group A did not get answered it was set to ring group B, and group B was set if there was no answer to ring group A. This is an obvious loop.

The problem came in when a call was parked, by transfering to the parking lot, and the call would ring back since it was not answered, then they would re-park the call. The system would begin and endless loop and bring the processer down to its knees. The system would be so overloaded it would not process new calls.

Again I'll recreate the process next week and send you a clip from the log, or a link to the entire log.

Rich

07/20/09 11:19:50 changed by geekct

Seem to have had an occurance of the same issue. Here is some of the loop:

[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:8] Set("SIP/3001-b766c308<ZOMBIE>", "RRNODEST=") i
n new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:9] Set("SIP/3001-b766c308<ZOMBIE>", "__NODEST=602"
) in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:10] Set("SIP/3001-b766c308<ZOMBIE>", "RecordMethod
=Group") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:11] Macro("SIP/3001-b766c308<ZOMBIE>", "record-ena
ble|3001-3020-3006-3021-3005-3008-3002-3010-3007-3009-3018-3003|Group") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [s@macro-record-enable:1] GotoIf("SIP/3001-b766c308<ZOMBIE>", "1?
check") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Goto (macro-record-enable,s,4)
[Jul 20 08:36:56] DEBUG[13339] app_macro.c: Executed application: GotoIf
[Jul 20 08:36:56] DEBUG[13339] app_macro.c: Extension s, macroexten 602, priority 3 returned normally even though call was hu
ng up
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:12] Set("SIP/3001-b766c308<ZOMBIE>", "RingGroupMet
hod=ringall") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:13] Macro("SIP/3001-b766c308<ZOMBIE>", "dial|60|tr
|3001-3020-3006-3021-3005-3008-3002-3010-3007-3009-3018-3003") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [s@macro-dial:1] GotoIf("SIP/3001-b766c308<ZOMBIE>", "1?dial") in
 new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Goto (macro-dial,s,3)
[Jul 20 08:36:56] DEBUG[13339] app_macro.c: Executed application: GotoIf
[Jul 20 08:36:56] DEBUG[13339] app_macro.c: Extension s, macroexten 602, priority 2 returned normally even though call was hu
ng up
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:14] Set("SIP/3001-b766c308<ZOMBIE>", "RingGroupMet
hod=") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:15] GotoIf("SIP/3001-b766c308<ZOMBIE>", "0?nodest"
) in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:16] Set("SIP/3001-b766c308<ZOMBIE>", "__NODEST=")
in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- Executing [602@ext-group:17] DBdel("SIP/3001-b766c308<ZOMBIE>", "BLKVM/602/
SIP/3001-b766c308<ZOMBIE>") in new stack
[Jul 20 08:36:56] VERBOSE[13339] logger.c:     -- DBdel: family=BLKVM, key=602/SIP/3001-b766c308<ZOMBIE>
[Jul 20 08:36:56] VERBOSE[18177] logger.c:     -- Executing [602@ext-group:18] Goto("SIP/3001-0a1a7860<ZOMBIE>", "ext-group|6
01|1") in new stack
[Jul 20 08:36:56] VERBOSE[18177] logger.c:     -- Goto (ext-group,601,1)
[Jul 20 08:36:56] VERBOSE[18177] logger.c:     -- Executing [601@ext-group:1] Macro("SIP/3001-0a1a7860<ZOMBIE>", "user-caller
id|") in new stack
[Jul 20 08:36:56] VERBOSE[18177] logger.c:     -- Executing [s@macro-user-callerid:1] Set("SIP/3001-0a1a7860<ZOMBIE>", "AMPUS
ER=3001") in new stack
[Jul 20 08:36:56] DEBUG[18177] app_macro.c: Executed application: Set
[Jul 20 08:36:56] DEBUG[18177] app_macro.c: Extension s, macroexten 601, priority 1 returned normally even though call was hu
ng up

I have 11GB of logs caputured. As this is a production box I cant "recreate" during business hours and I'm not sure the entry point. I had to dump 200GB of logs created byu the loop. It seemed to bounce between Ring Groups 601 and 602 every second. We added a third group and terminated the orphaned calls at a single ext. to prevent entry into the loop.

07/20/09 19:20:25 changed by p_lindheimer

checking with one of the developers on this. I thought he had found and fixed the problem but it doesn't appear anything has been updated in svn, hopefully he'll get back soon.

08/03/09 15:21:10 changed by jjshoe

  • svn_rev set to 7967.

fixed in r7967

08/07/09 18:20:26 changed by p_lindheimer

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [7969]) Merged revisions 7967-7968 via svnmerge from http://svn.freepbx.org/modules/branches/2.5

........

r7967 | jjshoe | 2009-08-03 12:19:34 -0700 (Mon, 03 Aug 2009) | 2 lines

fixes #3611, fixes #3435, fixes #3317, fixes #3307 Fixed pattern matching to prevent deadly parked call loop

........

r7968 | jjshoe | 2009-08-03 12:28:18 -0700 (Mon, 03 Aug 2009) | 4 lines

fixes #3735 Don't send the actual secret in the email

Also warn if a secret is blank.

........