Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runaway GC in Safari\Webkit on refresh without disconnect() #398

Open
WalrusSoup opened this issue Aug 2, 2024 · 6 comments
Open

Runaway GC in Safari\Webkit on refresh without disconnect() #398

WalrusSoup opened this issue Aug 2, 2024 · 6 comments
Assignees
Labels
priority: medium This PR should be reviewed after all high priority PRs. status: accepted This issue is accepted as valid, but it's not being worked on yet. type: bug This issue reports a bug.

Comments

@WalrusSoup
Copy link

WalrusSoup commented Aug 2, 2024

Running into an issue where a simple pubnub chat causes Safari to eat up 90% CPU when any channels have been joined. I've isolated it to opening a connection to channels. As soon as I call connect, it will have this issue after refreshing. It wont happen every refresh, but it happens 90% of the time.

Safari Version: Version 17.5 (19618.2.12.11.6)

Prior to refreshing, a lot of errors pop up in the console (~10 times):
Fetch API cannot load [x] due to access control checks.

This occurs even in a simple boilerplate page with nothing on it. This is from a vanilla vite project compilation, with an empty page:

Chat.init({
  publishKey: import.meta.env.VITE_PUBNUB_PUB_KEY,
  subscribeKey: import.meta.env.VITE_PUBNUB_SUB_KEY,
  userId: "1111111",
  storeUserActivityTimestamps: true,
  }).then((chat) => {
  const channel = await chat.getChannel("50")
  // connect or join
  channel.join((msg) => {
    console.log(msg);
  });
});

I do not see the errors Fetch API cannot load [x] due to access control checks when reloading via the developer console.

muh_cpu.mp4

If you force the browser to disconnect before reloading, we don't see the same behavior:

window.onbeforeunload = () => {
  console.log('unloading!!!!', window.chatRef);
  window.chatRef.sdk.disconnect();
}

On the development side, since we're in a Webkit view, i cannot seem to fix it for a reload. But i can at least prevent it from happening during hot module reloads:

import.meta.hot.on('vite:beforeFullReload', () => {
  if(window.chat) {
    console.log('!!!!UNLOADING CHAT!!!!');
    window.chat.sdk.disconnect();
    console.log('!!!!SDK DISCONNECTED!!!!');
  }
});

However the webkit environment does not fire the events onbeforeunload etc when refreshing which means this bug is basically stuck.

On reload, it appears that the timers that were scheduled are simply running forever. I set up a setTimeout() to simply disconnect Pubnub and set the object to undefined and the timers still hang out forever.

If I load, then reload the window and call destroy() on the current instance:

  setTimeout(() => {
    console.log('Calling Destroy()');
    window.chatRef.sdk.destroy();
  }, 5000)

When we compare snapshots we see that Safari seems to be hanging onto the SDK and throwing partial GCs hundreds of times:

Screenshot 2024-08-02 at 9 43 24 AM

Likewise, when calling destroy() on the SDK underneath, we still hang onto memory because there is no destroy() function on the ChatSDK. I can manually clean up both and THAT appears to fix the issue. However, this does not work when reloading. This must be called before we reload the window.

  setTimeout(() => {
    console.log('Calling Destroy()');
    window.chatRef.sdk.destroy();
    delete window.chatRef;
  }, 5000)

Perhaps I am missing something here?

@parfeon
Copy link
Contributor

parfeon commented Aug 2, 2024

@WalrusSoup thank you for sharing information about this issue.
It looks like the WKWebView way of handling JS context is the reason for what is happening. There are two ways to solve it:

  1. Initialize new WKWebView instance: this will force JS contexts to be destroyed
  2. The one which you mentioned: window.onbeforeunload - this can be done differently:
    a. The user handles this case and destroys the current SDK client / Chat SDK instance
    b. If the web version of JS SDK used, it can add onbeforeunload handler and destroy itself.

For 2.b. I will need to speak with the team about whether it will affect anyone if we will destroy by default.

@WalrusSoup
Copy link
Author

WKWebView does not seem to reliably trigger onbeforeunload. Should this be a bug report to webkit then since it's so easily and reliably recreated?

I managed to get it functional by hitting it from all sides in our app - but jeeze. This doesn't feel great.

if (import.meta.hot && !window.importHook) {
  import.meta.hot.on('vite:beforeFullReload', () => {
    if (window.chat) {
      // @ts-ignore
      window.chat.sdk.destroy();
    }
  });
  window.importHook = true;
}

reload via hotkey:

if (!window.refreshHook) {
  document.addEventListener('keydown', async (e) => {
    e = e || window.event;
    if (e.keyCode == 116) {
      window.chat.sdk.destroy();
      window.location.reload();
    }
  });
}

context menu needs to be replaced with one that can hook reload, so we disabled it.

document.addEventListener('contextmenu', function (e) {
    e.preventDefault();
});

@fwebdev
Copy link

fwebdev commented Oct 24, 2024

Is there any progress on this?

Our Customers also run into this Issue. The Regression was introduced in Version 8.2.7.
When I look at the Diff there is not much that changed. How could that cause the Error?

We had to rollback because of Customer Complaints.

@tance77
Copy link

tance77 commented Nov 13, 2024

@parfeon Any updates on this?

@parfeon
Copy link
Contributor

parfeon commented Nov 17, 2024

@WalrusSoup as I mentioned on #396 that because of suggested fix (when browser forcefully closes connection) Web Fetch API doesn't provide much information about error, and we can only assume using error message (in most of the cases). To properly handle this case, subscription timeout should be configured for PubNub instance, so it will recreate idle request on its own before it will be forcefully terminated by the browser.

Fixes from #396 have been reverted in 8.2.9 and there is even newer version available already: 8.3.0

@parfeon parfeon self-assigned this Nov 17, 2024
@parfeon parfeon added status: accepted This issue is accepted as valid, but it's not being worked on yet. type: bug This issue reports a bug. priority: medium This PR should be reviewed after all high priority PRs. labels Nov 17, 2024
@WalrusSoup
Copy link
Author

WalrusSoup commented Nov 19, 2024

Our Fix:
We've patched this in our own project by listening for a Tauri navigation event. For browsers, right now, i would suggest listening to both pagehide and beforeunload and calling chat.sdk.destroy(true) or pubnub.destroy(true)

window.addEventListener('beforeunload', () => {
// yeet the sdk
});
window.addEventListener('pagehide', (event) => {
// yeet the sdk
});

The runaway GC still remains an issue as soon as any channel connection is opened. I suspect that we're looking at an upstream node-fetch issue.

What's even crazier is you can open the connection and refresh the page without connecting to pubnub and the page still goes into a GC spiral.

Screenshot 2024-11-18 at 4 06 39 PM

I loaded it the first time, commented out the pubnub items for the second run, then gave it a refresh. Safari still confidently holding on to a ton of functions from within the sdk.

Screenshot 2024-11-18 at 4 10 31 PM

And again it's still pretty trivial to reproduce:

import { Chat } from "../js-chat/lib/src/"

Chat.init({
    publishKey: import.meta.env.VITE_PUBNUB_PUB_KEY,
    subscribeKey: import.meta.env.VITE_PUBNUB_SUB_KEY,
    userId: "user_1111",
    subscribeRequestTimeout: 59,
}).then(async (chat) => {
  console.log('chat is init. im gonna refresh now i guess');
  const currentTimestamp = new Date().getTime();
  const secondUser = await chat.createUser(`user_${currentTimestamp}`, {
    name: `User ${currentTimestamp}`,
    profileUrl: "https://www.example.com/profile.png"
  });
  const { channel } = await chat.createDirectConversation({
    user: secondUser,
    channelId: `direct-user-${secondUser.id}-${currentTimestamp}`
  });
  channel.join((msg) => {
    console.log(msg);
  })
});

Also would like to point out - this bug will trip also from any navigation event away from the page when the back button is used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: medium This PR should be reviewed after all high priority PRs. status: accepted This issue is accepted as valid, but it's not being worked on yet. type: bug This issue reports a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants