Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PhoenixTests crash with internal URLSession error #10

Open
atdrendel opened this issue Jun 4, 2020 · 13 comments · Fixed by #4
Open

PhoenixTests crash with internal URLSession error #10

atdrendel opened this issue Jun 4, 2020 · 13 comments · Fixed by #4
Labels

Comments

@atdrendel
Copy link
Contributor

Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46

Unsolved thread on Developer Forums

2020-06-04T22:25:54.0091290Z $$ creating join timer seconds(10) 1
2020-06-04T22:25:54.0091380Z socket input success(Phoenix.WebSocket.Message.string("[19,19,\"room:lobby\",\"phx_reply\",{\"response\":{},\"status\":\"ok\"}]"))
2020-06-04T22:25:54.0091540Z channel input channelMessage(Phoenix.IncomingMessage(joinRef: Optional(Phoenix.Ref(rawValue: 19)), ref: Optional(Phoenix.Ref(rawValue: 19)), topic: "room:lobby", event: Phoenix.PhxEvent.reply, payload: ["status": ok, "response": {
2020-06-04T22:25:54.0091680Z }]))
2020-06-04T22:25:54.0091830Z socket sending OutgoingMessage(joinRef: Optional(Phoenix.Ref(rawValue: 19)), ref: Phoenix.Ref(rawValue: 20), topic: "room:lobby", event: Phoenix.PhxEvent.custom("repeat"), payload: ["amount": 5, "echo": "This should be repeated"], sentAt: Dispatch.DispatchTime(rawValue: 519325202881))
2020-06-04T22:25:54.0091970Z socket sending data 75 bytes
2020-06-04T22:25:54.0092100Z socket input success(Phoenix.WebSocket.Message.string("[null,null,\"room:lobby\",\"repeated\",{\"echo\":\"This should be repeated\",\"n\":1}]"))
2020-06-04T22:25:54.0092260Z channel input channelMessage(Phoenix.IncomingMessage(joinRef: nil, ref: nil, topic: "room:lobby", event: Phoenix.PhxEvent.custom("repeated"), payload: ["echo": This should be repeated, "n": 1]))
2020-06-04T22:25:54.0092460Z socket input success(Phoenix.WebSocket.Message.string("[null,null,\"room:lobby\",\"repeated\",{\"echo\":\"This should be repeated\",\"n\":2}]"))
2020-06-04T22:25:54.0093190Z channel input channelMessage(Phoenix.IncomingMessage(joinRef: nil, ref: nil, topic: "room:lobby", event: Phoenix.PhxEvent.custom("repeated"), payload: ["n": 2, "echo": This should be repeated]))
2020-06-04T22:25:54.0093470Z socket input success(Phoenix.WebSocket.Message.string("[null,null,\"room:lobby\",\"repeated\",{\"echo\":\"This should be repeated\",\"n\":3}]"))
2020-06-04T22:25:54.0093710Z channel input channelMessage(Phoenix.IncomingMessage(joinRef: nil, ref: nil, topic: "room:lobby", event: Phoenix.PhxEvent.custom("repeated"), payload: ["n": 3, "echo": This should be repeated]))
2020-06-04T22:25:54.0093960Z socket input success(Phoenix.WebSocket.Message.string("[null,null,\"room:lobby\",\"repeated\",{\"echo\":\"This should be repeated\",\"n\":4}]"))
2020-06-04T22:25:54.0094190Z channel input channelMessage(Phoenix.IncomingMessage(joinRef: nil, ref: nil, topic: "room:lobby", event: Phoenix.PhxEvent.custom("repeated"), payload: ["echo": This should be repeated, "n": 4]))
2020-06-04T22:25:54.0095500Z Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
2020-06-04T22:25:54.0096290Z Test Case '-[PhoenixTests.ChannelTests testReceiveMessages]' passed (0.015 seconds).
2020-06-04T22:25:54.0096960Z Test Case '-[PhoenixTests.ChannelTests testRejoinsAfterDisconnect]' started.
2020-06-04T22:25:54.0097200Z socket input success(Phoenix.WebSocket.Message.string("[null,null,\"room:lobby\",\"repeExited with signal code 4
2020-06-04T22:25:59.1240250Z ##[error]Process completed with exit code 1.
@atdrendel atdrendel added the bug label Jun 4, 2020
@atdrendel atdrendel linked a pull request Jun 4, 2020 that will close this issue
98 tasks
@atdrendel
Copy link
Contributor Author

This has happened locally on my MacBook Pro and when using GitHub Actions.

@myobie
Copy link
Member

myobie commented Jun 6, 2020

@atdrendel
Copy link
Contributor Author

This reproduced on my MacBook Pro just now. I gathered all system state and will try to catch this tomorrow.

URLSessionError.txt

@atdrendel
Copy link
Contributor Author

I haven't seen this one in a while. I'll keep it open, however, because it's always been rare.

@atdrendel
Copy link
Contributor Author

This happened a few times yesterday and today. Most recently, it happened with SocketTests.testPushOntoSocket

@atdrendel
Copy link
Contributor Author

It happened today. I wonder if I misread the previous crash log. Maybe it happens in SocketTests.testRemoteClosePublishesClose(), instead. At least, that's what it looks like from today's test log:

2020-06-27T14:58:06.3595270Z Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' started.
2020-06-27T14:58:06.3649020Z [info] CONNECTED TO ServerWeb.Socket in 172µs
2020-06-27T14:58:06.3658760Z   Transport: :websocket
2020-06-27T14:58:06.3658920Z   Serializer: Phoenix.Socket.V2.JSONSerializer
2020-06-27T14:58:06.3659020Z   Parameters: %{"user_id" => "59", "vsn" => "2.0.0"}
2020-06-27T14:58:06.3660420Z Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
2020-06-27T14:58:06.3661020Z Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' passed (0.006 seconds).
2020-06-27T14:58:06.3661490Z Test Case '-[PhoenixTests.SocketTests testRemoteClosePublishesClose]' started.
2020-06-27T14:58:06.3672900Z Exited with signal code 4
2020-06-27T14:58:08.5229670Z [info] JOINED room:timeout in 3023ms
2020-06-27T14:58:08.5229920Z   Parameters: %{"join" => true, "timeout" => 3000}
2020-06-27T14:58:08.7514760Z [info] JOINED room:timeout in 3014ms
2020-06-27T14:58:08.7514960Z   Parameters: %{"join" => true, "timeout" => 3000}
2020-06-27T14:58:08.9526880Z [info] JOINED room:timeout in 3022ms
2020-06-27T14:58:08.9527180Z   Parameters: %{"join" => true, "timeout" => 3000}
2020-06-27T14:58:11.3695530Z ##[error]Process completed with exit code 1.
2020-06-27T14:58:11.3717930Z Cleaning up orphan processes
2020-06-27T14:58:11.4041660Z Terminate orphan process: pid (1580) (erl_child_setup)
2020-06-27T14:58:11.4106930Z Terminate orphan process: pid (1562) (beam.smp)

@atdrendel
Copy link
Contributor Author

The crash just happened again. It again came from our friend SocketTests.testRemoteClosePublishesClose(). Interesting.

Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' passed (0.017 seconds).
Test Case '-[PhoenixTests.SocketTests testRemoteClosePublishesClose]' started.
Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
Exited with signal code 4

@atdrendel
Copy link
Contributor Author

Ok, so, maybe it's not as straightforward as we had hoped. Another crash while running the tests.

Test Case '-[PhoenixTests.ChannelTests testClosesChannelAfterReceivingOkResponseFromServer]' passed (0.015 seconds).
Test Case '-[PhoenixTests.ChannelTests testClosesChannelOnTimeoutOfLeavePush]' started.
[info] CONNECTED TO ServerWeb.Socket in 571µs
  Transport: :websocket
  Serializer: Phoenix.Socket.V2.JSONSerializer
  Parameters: %{"user_id" => "16", "vsn" => "2.0.0"}
[info] JOINED room:lobby in 29µs
  Parameters: %{}
Test Case '-[PhoenixTests.ChannelTests testClosesChannelOnTimeoutOfLeavePush]' passed (1.043 seconds).
Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
Test Case '-[PhoenixTests.ChannelTests testClosingChannelSetsStateToClosed]' started.
Exited with signal code 4

@myobie
Copy link
Member

myobie commented Jul 2, 2020

@atdrendel they are both related to closing...

@atdrendel
Copy link
Contributor Author

atdrendel commented Jul 3, 2020

I've got Charles working correctly with our Phoenix WebSockets. I'm going to run our tests in a loop until they fail in an attempt to catch the exception and the input causing it.

Running tests in a loop:

while swift test -v; do :; done

Update

I ran swift test -v in a loop a bunch of times. It crashed/failed a few times, but I couldn't see the reasons why the runs failed. So, I'm changing my approach to using xcodebuild to run the tests.

# generate the Xcode project file
swift package generate-xcodeproj

# run the tests in a loop
while xcodebuild test -project Phoenix.xcodeproj -scheme Phoenix-Package; do :; done

@atdrendel
Copy link
Contributor Author

I managed to reproduce the crash while proxying the WebSocket traffic. Based on what I saw, the traffic was normal, as were the "requests" made before it.

Here's the information I captured during the crash.

Test log

Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' started.
2020-07-04 00:51:09.298627+0200 xctest[13201:8361315] Connection 60: received failure notification
Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' passed (0.031 seconds).
2020-07-04 00:51:09.327743+0200 xctest[13201:8361356] Fatal error: Only one of message or error should be nil: file /AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/swiftlang_overlay_Foundation/swiftlang-1103.8.25.8/swift/stdlib/public/Darwin/Foundation/URLSession.swift, line 46
2020-07-04 00:51:29.671 xcodebuild[13194:8362667]  IDETestOperationsObserverDebug: Writing diagnostic log for test session to:
Restarting after unexpected exit, crash, or test timeout in SocketTests/testPushOntoSocket(); summary will include totals from previous launches.

WebSocket traffic

C: [null,1,"phoenix","heartbeat",{}]
S: [null,1,"phoenix","phx_reply",{"response":{},"status":"ok"}]

Crash log

Thread 8 Crashed:: Dispatch queue: WebSocket.serialQueue
0   libswiftCore.dylib            	0x00007fff7041230f _assertionFailure(_:_:file:line:flags:) + 863
1   libswiftFoundation.dylib      	0x00007fff70982a4a partial apply for closure #1 in NSURLSessionWebSocketTask.receive(completionHandler:) + 458
2   libswiftFoundation.dylib      	0x00007fff70930fc7 thunk for @escaping @callee_guaranteed (@guaranteed NSURLSessionWebSocketMessage?, @guaranteed Error?) -> () + 71
3   com.apple.Foundation          	0x00007fff39433ac5 __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 7
4   com.apple.Foundation          	0x00007fff394339e6 -[NSBlockOperation main] + 80
5   com.apple.Foundation          	0x00007fff39433981 __NSOPERATION_IS_INVOKING_MAIN__ + 17
6   com.apple.Foundation          	0x00007fff39432bb3 -[NSOperation start] + 722
7   com.apple.Foundation          	0x00007fff394328d9 __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 17
8   com.apple.Foundation          	0x00007fff394327a9 __NSOQSchedule_f + 182
9   libdispatch.dylib             	0x00007fff70d5d658 _dispatch_client_callout + 8
10  libdispatch.dylib             	0x00007fff70d5fd79 _dispatch_block_invoke_direct + 245
11  libclang_rt.tsan_osx_dynamic.dylib	0x0000000102cbf30c __tsan::invoke_and_release_block(void*) + 12
12  libclang_rt.tsan_osx_dynamic.dylib	0x0000000102cbf062 __tsan::dispatch_callback_wrap(void*) + 306
13  libdispatch.dylib             	0x00007fff70d5d658 _dispatch_client_callout + 8
14  libdispatch.dylib             	0x00007fff70d62c44 _dispatch_lane_serial_drain + 597
15  libdispatch.dylib             	0x00007fff70d635d6 _dispatch_lane_invoke + 363
16  libdispatch.dylib             	0x00007fff70d6cc09 _dispatch_workloop_worker_thread + 596
17  libsystem_pthread.dylib       	0x00007fff70fb7a3d _pthread_wqthread + 290
18  libsystem_pthread.dylib       	0x00007fff70fb6b77 start_wqthread + 15

urlsession-crash.txt
urlsession-crash-network-traffic.chlsj.zip

@atdrendel
Copy link
Contributor Author

I ran the test suite against the in-progress NWConnection-backed websocket-apple branch. In order to make the tests pass consistently, I had to disable flaky tests, including testJoinRetriesWithBackoffIfTimeout(), testIsJoiningAfterJoin(), and one more I forgot (which I should have written down, of course). After doing so, though, the tests passed without an issue for over eight hours, at which point I just canceled the test run.

However, while running the tests, I saw "unexpected" messages being received when the WebSocket was not in the connecting or open states for the same tests that typically cause this crash when using URLSessionWebSocketTask.

Test Case '-[PhoenixTests.SocketTests testPushOntoSocket]' started.
onStateUpdate preparing
onStateUpdate ready
Fatal error: Unexpected message: [null,1,"phoenix","phx_reply",{"response":{},"status":"ok"}]: file /Users/atdrendel/shareup/websocket-apple/Sources/WebSocket/WebSocket.swift, line 178
2020-07-12 23:43:36.353530+0200 xctest[86982:469880] Fatal error: Unexpected message: [null,1,"phoenix","phx_reply",{"response":{},"status":"ok"}]: file /Users/atdrendel/shareup/websocket-apple/Sources/WebSocket/WebSocket.swift, line 178
Program ended with exit code: 9

@myobie
Copy link
Member

myobie commented Jul 14, 2020

@atdrendel that's great and it makes sense to me that we might receive a message during .closing. We should break out each possible case and output the variable it wraps to see if that's what is happening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants