API v5 status #450

Open
19 tasks
jan-niestadt opened this issue Sep 14, 2023 · 0 comments
jan-niestadt commented Sep 14, 2023

The current default on the dev branch (and for upcoming BlackLab 4.0), API v4, is transitional:

  • 98% compatible with v3
  • adds new /corpora/NAME endpoints

There's also the experimental future version, API v5. This only supports the /corpora/NAME endpoints introduced in v4 and removes the older /CORPUSNAME/... endpoints.

The new endpoints remove deprecated v3 features, change some keys and structures, and bring the XML output more in line with the JSON output.

Use api=exp to check that your code works with API v5 and you don't rely on v3/4 endpoints or features.
You can also use api=3 to get the best compatibility with older BlackLab versions, although you likely don't need this: apart from additions, the changes between v3 and v4 are very minor.
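
As a rough sketch of how a client could pin the API version, the snippet below builds request URLs for the /corpora/NAME endpoints with an api parameter. The base URL, the corpus name and the helper corpus_url are hypothetical; the hits operation and the patt parameter are only used as an illustration.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL of a BlackLab Server instance (an assumption).
BASE_URL = "http://localhost:8080/blacklab-server"


def corpus_url(corpus, operation="", api="exp", **params):
    """Build a URL for a /corpora/NAME endpoint, pinning the API version.

    api="exp" selects the experimental API v5; api="3" asks for the best
    compatibility with older BlackLab versions.
    """
    path = f"{BASE_URL}/corpora/{quote(corpus, safe='')}"
    if operation:
        path += f"/{quote(operation, safe='')}"
    # Put the api parameter first, followed by any operation parameters.
    query = urlencode({"api": api, **params})
    return f"{path}?{query}"


# Example: check that a hits request still works against API v5.
url = corpus_url("my-corpus", "hits", patt='"cat"')
```

Running the same requests once with api=exp and once without it should show whether the client code depends on anything that v5 removes.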

Don't rely on any of the new API stuff until BlackLab 4.0 is released.

Also see https://github.com/INL/BlackLab/tree/dev/site/docs/development/api-redesign#api-support-roadmap (TODO: update that doc, add points here as well)

TODO API v5 URGENT:

  • XML: When using usecontent=orig, don't make the content part of the XML anymore.
    Instead, escape it using CDATA (again, same as in JSON). Also consider returning both
    the FI concordances and the original content (if requested), so the response
    structure doesn't fundamentally change because of one parameter value.
    Optionally, have a parameter to include the content as part of the XML if desired, to simplify response handling?
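
To illustrate what the CDATA approach would mean for a client, here is a small sketch. The element names (hit, match) and the sample content are made up for the example and are not the actual response structure; the point is only that CDATA-wrapped content arrives as one opaque string instead of as nested XML elements.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section, splitting any "]]>" it contains."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Hypothetical original document content, as requested with usecontent=orig.
original = '<w lemma="cat">cats</w> &amp; <w>dogs</w>'

# Hypothetical response fragment; the element names are illustrative only.
response = f"<hit><match>{cdata(original)}</match></hit>"

match = ET.fromstring(response).find("match")
content = match.text   # the original content as a single string
children = len(match)  # number of child elements under <match>
```

Here content equals the original string and children is 0, so the structure of the response no longer depends on what the original document looks like. A client that wants the content as XML can parse that string separately.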

TODO API v5 IMPORTANT (v4.0 or v4.1):

  • Update API documentation
  • make subcorpusSize and includetokencount more orthogonal
  • input-format, corpus-status (see proxy), displayName, description, etc. in custom block
  • include API v5 (/corpora/...) link in API v4 responses, e.g. in an apiV5Url field.

TODO API v5 NICE:

  • Start a proxy-v5 branch to port the proxy to (only?) support the new API (hopefully getting rid of a lot of hacks)
  • "fake" indexname parameter should be renamed to corpus (for API v5 only).
  • /corpora/ should return list of corpora.
  • grouping statistics: enable sorting by several calculated properties (e.g. various relative frequencies, such as hits / tokens in matching docs). These are currently calculated in the frontend, which doesn't necessarily have the full result set.
  • Full JSON API:
    • sort/group properties should also be encodable as JSON? (both in parameter and response?)
    • Full query (patt+filter+sort+group+sample)
    • Full JSON request?
    • API v5 response uses JSON everywhere? (e.g. group criteria, group identity, etc.)
  • include links to related resources:
    • include (relative?) URLs to linked resources: corpora, fields, doc info/contents, prev/next page
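
To make the grouping statistics point above concrete, here is a toy sketch of why sorting on a calculated property belongs on the server. The group data and key names are invented for the example; only the idea of "hits / tokens in matching docs" comes from the list above.

```python
# Hypothetical group statistics: hits per group and tokens in matching docs.
groups = [
    {"identity": "1800s", "hits": 120, "tokens": 60_000},
    {"identity": "1900s", "hits": 300, "tokens": 600_000},
    {"identity": "2000s", "hits": 90, "tokens": 30_000},
    {"identity": "1700s", "hits": 40, "tokens": 10_000},
]


def relative_frequency(group):
    """Hits divided by the number of tokens in the matching documents."""
    return group["hits"] / group["tokens"]


def top_groups(groups, n):
    """Return the n group identities with the highest relative frequency."""
    ranked = sorted(groups, key=relative_frequency, reverse=True)
    return [g["identity"] for g in ranked[:n]]


# Sorting the full result set (what the server could do):
server_side = top_groups(groups, 2)

# Sorting only the first page of two groups (what a frontend might hold):
client_side = top_groups(groups[:2], 2)
```

With this data, server_side is ["1700s", "2000s"] while client_side is ["1800s", "1900s"]: a frontend that only holds part of the groups cannot produce the correct ordering.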

MAYBE:

  • More consistency between parameters, response and configuration files. (e.g. use the term hitLimit everywhere for the same concept)
  • DataStream classes other than Solr should detect mismatches like startEntry/endItem too!
  • summary.pattern.corpusql often includes superfluous parentheses, because it's fairly tricky to figure out which ones are required (depends on operator precedence and associativity). It would be nice if the corpusql was always as clean as possible.
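
The startEntry/endItem mismatch detection mentioned above could work roughly like this. This is a Python toy, not the actual DataStream code: the class name CheckedStream and its internals are hypothetical, and it only tracks entries and items.

```python
class CheckedStream:
    """Toy output stream that verifies start/end calls are properly paired."""

    def __init__(self):
        self._open = []  # stack of currently open structure kinds

    def _start(self, kind):
        self._open.append(kind)

    def _end(self, kind):
        if not self._open:
            raise ValueError(f"end{kind} called with nothing open")
        opened = self._open.pop()
        if opened != kind:
            raise ValueError(f"start{opened} closed by end{kind}")

    def startEntry(self):
        self._start("Entry")

    def endEntry(self):
        self._end("Entry")

    def startItem(self):
        self._start("Item")

    def endItem(self):
        self._end("Item")


# Example: an entry that is wrongly closed as an item is rejected.
stream = CheckedStream()
stream.startEntry()
try:
    stream.endItem()
    error = None
except ValueError as exc:
    error = str(exc)
```

A stack of open structures is enough to catch this class of bug at the point where the wrong end call is made, instead of producing a malformed response.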
@jan-niestadt jan-niestadt self-assigned this Sep 14, 2023
@jan-niestadt jan-niestadt added the webservice Relates to BlackLab Server, our web service label Sep 14, 2023
@jan-niestadt jan-niestadt added this to the Parallel corpora milestone Feb 8, 2024
@jan-niestadt jan-niestadt modified the milestones: Parallel corpora, v4.0 Mar 5, 2024