"My Documents" data fetch needs overhaul #91
Comments
I can reproduce the behaviour locally. When I refresh the "My Documents" page, the client makes the following API calls.
All calls took less than a second except the documents one, which took 15+ seconds. I have also gathered database logs for the page load and stored them in a file called db.log.
head -5 db.log
tail -5 db.log
The first and the last queries are about 17 seconds apart. It looks like it's making a lot of database queries at the moment.
grep "statement: SELECT" db.log | wc -l
About 30% of the calls are related to the tables matched by the following pattern.
grep "FROM \"rpc_historical" db.log | wc -l
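The two grep counts can also be broken down per table in one pass. This is a minimal sketch, assuming each statement is logged on a single line containing `statement: SELECT` (as the grep commands imply) and that the first `FROM "..."` names the table of interest. The function names and the sample lines are hypothetical, not taken from the real db.log.

```python
import re
from collections import Counter

# Matches the first double-quoted table name that follows FROM.
FROM_TABLE = re.compile(r'FROM "([^"]+)"')


def summarize_selects(lines):
    """Count logged SELECT statements per table (first FROM target only)."""
    counts = Counter()
    for line in lines:
        if "statement: SELECT" not in line:
            continue  # skip non-SELECT log lines
        match = FROM_TABLE.search(line)
        counts[match.group(1) if match else "<unknown>"] += 1
    return counts


def share_with_prefix(counts, prefix):
    """Fraction of counted statements whose table name starts with prefix."""
        # Returns 0.0 when nothing was counted, to avoid dividing by zero.
    total = sum(counts.values())
    hits = sum(n for table, n in counts.items() if table.startswith(prefix))
    return hits / total if total else 0.0


# Hypothetical log lines, only to show the expected input shape.
sample_lines = [
    'LOG:  statement: SELECT "rpc_rfctobe"."id" FROM "rpc_rfctobe"',
    'LOG:  statement: SELECT "rpc_historicalrfctobe"."id" FROM "rpc_historicalrfctobe" WHERE "rpc_historicalrfctobe"."id" = 1',
    'LOG:  duration: 0.4 ms',
]
sample_counts = summarize_selects(sample_lines)
print(sample_counts.most_common())
print(share_with_prefix(sample_counts, "rpc_historical"))
```

Against the real file, something like `summarize_selects(open("db.log"))` would give the per-table breakdown, provided the log really is one statement per line.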
At a quick glance, they mostly look like the following.
SELECT "rpc_historicalrfctobe"."id",
"rpc_historicalrfctobe"."is_april_first_rfc",
"rpc_historicalrfctobe"."rfc_number",
"rpc_historicalrfctobe"."external_deadline",
"rpc_historicalrfctobe"."internal_goal",
"rpc_historicalrfctobe"."disposition_id",
"rpc_historicalrfctobe"."draft_id",
"rpc_historicalrfctobe"."submitted_format_id",
"rpc_historicalrfctobe"."submitted_std_level_id",
"rpc_historicalrfctobe"."submitted_boilerplate_id",
"rpc_historicalrfctobe"."submitted_stream_id",
"rpc_historicalrfctobe"."intended_std_level_id",
"rpc_historicalrfctobe"."intended_boilerplate_id",
"rpc_historicalrfctobe"."intended_stream_id",
"rpc_historicalrfctobe"."history_id",
"rpc_historicalrfctobe"."history_date",
"rpc_historicalrfctobe"."history_change_reason",
"rpc_historicalrfctobe"."history_type",
"rpc_historicalrfctobe"."history_user_id"
FROM "rpc_historicalrfctobe"
WHERE "rpc_historicalrfctobe"."id" = 1
ORDER BY "rpc_historicalrfctobe"."history_date" DESC, "rpc_historicalrfctobe"."history_id" DESC
My understanding is that when a document is first inserted into the database, it generates a history entry with
-- simplified version of the query above
SELECT "rpc_historicalrfctobe"."id",
"rpc_historicalrfctobe"."history_id",
"rpc_historicalrfctobe"."history_date",
"rpc_historicalrfctobe"."history_type"
FROM "rpc_historicalrfctobe"
WHERE "rpc_historicalrfctobe"."id" = 1
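The `WHERE ... "id" = 1` filter suggests one history query per document. If that is what is happening (an assumption; the log excerpt alone doesn't prove it), the lookups could be collapsed into a single query and grouped afterwards. Below is a rough sketch using SQLite from the Python standard library rather than the project's real database layer. The table `historical`, the helper names, and the sample `history_type` values are all made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the history table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE historical ("
        "history_id INTEGER PRIMARY KEY, id INTEGER, "
        "history_date TEXT, history_type TEXT)"
    )
    rows = [
        (1, 1, "2024-01-01", "created"),  # placeholder history_type values
        (2, 1, "2024-02-01", "changed"),
        (3, 2, "2024-01-15", "created"),
    ]
    conn.executemany("INSERT INTO historical VALUES (?, ?, ?, ?)", rows)
    return conn


def history_per_document(conn, doc_ids):
    """One query per document, mirroring the pattern seen in the log."""
    result = {}
    for doc_id in doc_ids:
        result[doc_id] = conn.execute(
            "SELECT history_id, history_date, history_type FROM historical "
            "WHERE id = ? ORDER BY history_date DESC, history_id DESC",
            (doc_id,),
        ).fetchall()
    return result


def history_batched(conn, doc_ids):
    """One query for all documents; rows are grouped by document in Python."""
    doc_ids = list(doc_ids)
    if not doc_ids:
        return {}
    placeholders = ",".join("?" for _ in doc_ids)
    rows = conn.execute(
        "SELECT id, history_id, history_date, history_type FROM historical "
        f"WHERE id IN ({placeholders}) "
        "ORDER BY history_date DESC, history_id DESC",
        doc_ids,
    )
    result = {doc_id: [] for doc_id in doc_ids}
    for doc_id, *rest in rows:
        result[doc_id].append(tuple(rest))  # per-document order is preserved
    return result


conn = make_demo_db()
print(history_per_document(conn, [1, 2]) == history_batched(conn, [1, 2]))
```

Both helpers return the same mapping, but the batched version issues one statement regardless of how many documents are on the page.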
Sorry, I didn't mean it literally. What I meant to convey was that the document histories are very short right now compared to what they'll be when the system is deployed and documents are actually being processed. The "worrying" part is that, even with minimal history lengths, it's already a near-dominant factor in the processing time. Thanks for doing the detailed investigation. I'm hopeful that the performance we're seeing now will improve enormously when we go from a quick-and-dirty "just get it going" implementation to something more mature.
The "My Documents" page loads data to display by retrieving all the RfcToBe data and all assignments from the backend, then filters these on the client side. This is extremely slow (takes ~11 seconds on my dev machine with the migrated data set) and reveals data that should perhaps not be visible to every client.
A substantial chunk of this time (around 75-80%) is spent computing history that is not actually shown on the view. (This is a bit worrying because there's essentially no history in the migrated data, but it's not time to panic about it.)
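One possible direction is to do the filtering on the backend and leave history out of the response entirely. The sketch below shows that idea in plain Python. The `Assignment` and `RfcToBe` shapes, their field names, and `my_documents` are hypothetical stand-ins, not the project's actual models or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RfcToBe:
    id: int
    name: str  # hypothetical field; note that no history is carried here


@dataclass(frozen=True)
class Assignment:
    rfc_to_be_id: int
    person_id: int


def my_documents(person_id, documents, assignments):
    """Return only the documents assigned to person_id, plus those assignments.

    Filtering here, rather than in the browser, means other people's
    documents and assignments never leave the backend.
    """
    mine = [a for a in assignments if a.person_id == person_id]
    wanted = {a.rfc_to_be_id for a in mine}
    return [d for d in documents if d.id in wanted], mine


docs = [RfcToBe(1, "draft-a"), RfcToBe(2, "draft-b")]
assigns = [Assignment(1, 7), Assignment(2, 8)]
print(my_documents(7, docs, assigns))
```

In the real backend this would presumably be a filtered database query rather than list comprehensions, but the response shape is the point: only the requesting user's rows, and no computed history.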