-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HTTP Last-Modified header for TEI resources #396
Conversation
modules/pages.xqm
Outdated
let $if-modified-since := try { request:get-attribute("if-modified-since") => parse-ietf-date() } catch * { () } | ||
let $should-return-304 := | ||
if (exists($last-modified) and exists($if-modified-since)) then | ||
$if-modified-since gt $last-modified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per rfc2616 clients SHOULD set If-Modified-Since to a previously received Last-Modified. So using ge
(greater or equal) instead of gr
(greater) here seems desirable. Otherwise most requests will still get a 200 instead of a 304.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agh2342 Thank you! I've now switched to ge
.
Address @agh2342's comment: > As per rfc2616 clients SHOULD set If-Modified-Since to a previously received Last-Modified. So using ge (greater or equal) instead of gr (greater) here seems desirable. Otherwise most requests will still get a 200 instead of a 304.
For some info on how Last-Modified was handled on an earlier incarnation of HSG, see https://markmail.org/message/j5dk2mlho6khxfpv. |
|
||
let $last-modified := | ||
if (exists($publication-id) and exists($document-id)) then | ||
pages:last-modified($publication-id, $document-id, $section-id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pages:last-modified($publication-id, $document-id, $section-id) | |
pages:last-modified($publication-id, $document-id, $section-id) => format-dateTime("[FNn,*-3], [D01] [MNn,*-3] [Y0001] [H01]:[m01]:[s01] [Z0000]", "en", (), ()) => parse-ietf-date() |
xmldb:last-modified() has millisecond resolution.
HTTP Last-Modified
and If-Modified-Since
headers have second resolution.
We need to make sure we mangle this the same way for outgoing and incoming HTTP headers and xmldb:last-modified() calls so comparisons work as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agh2342 I've applied the change and believe it correctly truncates milliseconds. To confirm the approach:
xquery version "3.1";
let $date := current-dateTime()
return
(
$date,
$date
=> format-dateTime("[Y0001]-[M01]-[D01]T[H01]:[m01]:[s01][Z]")
=> xs:dateTime()
)
This returns:
(
xs:dateTime("2022-02-03T12:30:10.395-05:00"),
xs:dateTime("2022-02-03T12:30:10-05:00")
)
I deployed the change locally, and the calculation appears to be correct.
In a new commit, I added functions to look up and generate the HTTP "Created" header, so that both the created date and last-modified dates are accurate. I did this because in practice the Created header that was automatically generated before was often later than the last-modified date. For example:
No logic on our end depends on the Created date, but it was unintuitive, so I thought it was worth applying the same approach we used for generating headers to this piece of information. Now the created date is accurate - so it will always be the same as or earlier than the last-modified date:
|
@agh2342 Thanks for your reviews and feedback! |
Extend #396 to previous & next navigation buttons on TEI resources
Extend #396 to previous & next navigation buttons on TEI resources
As discussed with @windauer and @agh2342, accurate dependable HTTP Last-Modified headers could aid in our efforts to improve efficiency and performance of our servers. This PR is an initial step toward that goal.
It adds the Last-Modified header to resources on history.state.gov that are generated from TEI sources. Specifically, it defines new
*-last-modified
entries inconfig.xqm
's$config:PUBLICATIONS
map, which thepages:load
templating function retrieves when constructing the TEI data needed on a page. (See Update below.) This function then sets a request attribute,hsgshell.last-modified
(modeled onhsgshell.error
), containing the last-modified date. Then, afterview.xql
finishes executingtemplates:apply
, it checks for the presence of this attribute and, if found, passes the value toapp:set-last-modified
, which formats the date according to RFC 1123 (see w3c/qtspecs#36) and sends the formatted date toresponse:set-header("Last-Modified")
.Besides TEI resources, this PR adds a Last-Modified header for FRUS volumes that we don't have TEI for yet, by using the date of the resource in
/db/apps/frus/bibliography
.Other than this one non-TEI case, the PR doesn't handle Last-Modified headers for any other non-TEI resource. These could be added in future PRs.
The functions that look up last modified dates do so purely by looking up the last-modified date of TEI documents using the
xmldb:last-modified
function. It doesn't consider the last-modified date of the primary or any other imported HTML templates or XQuery modules, which could carry significant changes (i.e., site-wide menus, section sidebars, etc.). More nuanced handling could be added in a future PR.The PR only handles direct requests to resources, not the ajax requests with prev/next buttons (i.e.,
frus-ajax.xql
).However, since FRUS TEI pages are some of the most taxing pages on the site (and AFAIK crawlers tend to make direct requests for resources rather than activating prev/next buttons), this PR should help us tell whether the addition of dependable Last-Modified headers is worthwhile.
In my local testing, the
Last-Modified
header is correctly set and updated as resources are stored in the database, and there are no changes to the functionality of non-TEI pages.Update: With 59008ca, the code now checks for an "If-Modified-Since" header in
pages:load
and, if a 304 is appropriate, it immediately halts further processing of the template—and instead returns a 304.