Fix issue with pagination and add Highlight Export and Daily Review endpoints #78

Open
Scarvy opened this issue Jun 26, 2024 · 0 comments
Labels
enhancement New feature or request

Scarvy commented Jun 26, 2024

Description

This issue proposes a fix for paginating results from the Highlight EXPORT API endpoint. It also requests two new API methods: Highlight EXPORT and Daily Review LIST. Their descriptions, taken from the Readwise API docs:

Highlight EXPORT - If you want to pull all of the highlights from a user's account into your service (eg notetaking apps, backups, etc) this endpoint is all you need!

Daily Review LIST - Returns your daily review highlights

I'd be happy to submit a PR to implement these changes, including updating documentation and tests if the configuration is approved.

Potential Configuration

Highlight EXPORT

I suggest fixing the pagination issue in the _get_pagination() method by checking whether the endpoint parameter equals "/export/". When it does, the method paginates with the cursor-based logic (pageCursor / nextPageCursor); all other endpoints keep using the existing pagination technique. The cursor-based branch is not new code: it is the same logic as _get_pagination() in the ReadwiseReader class.

import logging
from time import sleep
from typing import Generator

from requests.exceptions import ChunkedEncodingError

def _get_pagination(
    ...
) -> Generator[dict, None, None]:
    ...
    if endpoint == "/export/":
        # Cursor-based pagination, taken from the `ReadwiseReader` class
        # `_get_pagination` method.
        pageCursor = None
        while True:
            if pageCursor:
                params.update({"pageCursor": pageCursor})
            logging.debug(f'Getting page with cursor "{pageCursor}"')
            try:
                response = getattr(self, get_method)(endpoint, params=params)
            except ChunkedEncodingError:
                # Transient connection error: wait, then retry the same cursor.
                logging.error(f'Error getting page with cursor "{pageCursor}"')
                sleep(5)
                continue
            data = response.json()
            yield data
            # Stop on a list response, a missing cursor, or a cursor that
            # did not advance.
            if (
                isinstance(data, list)
                or not data.get("nextPageCursor")
                or data.get("nextPageCursor") == pageCursor
            ):
                break
            pageCursor = data.get("nextPageCursor")
    else:
        # same code as before
        ...
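
To make the stopping conditions easier to check in isolation, here is a minimal, self-contained sketch of the same cursor loop. It is not the library's code: `paginate_by_cursor`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names, and the HTTP call is replaced by an in-memory lookup.

```python
from typing import Callable, Generator, Optional


def paginate_by_cursor(
    fetch_page: Callable[[dict], dict], params: Optional[dict] = None
) -> Generator[dict, None, None]:
    """Yield pages until the API stops returning a new cursor."""
    params = dict(params or {})  # copy so the caller's dict is not mutated
    page_cursor = None
    while True:
        if page_cursor:
            params["pageCursor"] = page_cursor
        data = fetch_page(params)
        yield data
        next_cursor = data.get("nextPageCursor")
        # Stop when there is no cursor or the cursor did not advance.
        if not next_cursor or next_cursor == page_cursor:
            break
        page_cursor = next_cursor


# Hypothetical in-memory stand-in for the HTTP responses, keyed by cursor.
FAKE_PAGES = {
    None: {"results": [1, 2], "nextPageCursor": "a"},
    "a": {"results": [3], "nextPageCursor": "b"},
    "b": {"results": [4], "nextPageCursor": None},
}


def fake_fetch(params: dict) -> dict:
    """Return the fake page that matches the requested cursor."""
    return FAKE_PAGES[params.get("pageCursor")]


all_results = [
    item for page in paginate_by_cursor(fake_fetch) for item in page["results"]
]
```

Copying `params` before updating it is a small deviation from the snippet above; it avoids leaking the last cursor into a dict the caller may reuse.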

Once that is fixed, it is possible to add a new generator method in the Readwise class to call the export endpoint. Below is my suggested implementation.

from typing import Generator, Optional

def export_highlights(
    self, updated_after: Optional[str] = None, ids: Optional[list[str]] = None
) -> Generator[ReadwiseExportResults, None, None]:
    """
    Export all highlights from Readwise.

    Args:
        updated_after: Only return highlights updated after this date
        ids: A list of book ids
    Yields:
        ReadwiseExportResults objects, one per book
    """
    params = {}
    if updated_after:
        params["updatedAfter"] = updated_after
    if ids:
        params["ids"] = ",".join(ids)
    for data in self.get_pagination_limit_20("/export/", params):
        for book in data["results"]:
            book_tags = [ReadwiseTag(**book_tag) for book_tag in book["book_tags"]]

            highlights = [
                ReadwiseExportHighlight(
                    tags=[ReadwiseTag(**tag) for tag in highlight["tags"]],
                    **{
                        key: value
                        for key, value in highlight.items()
                        if key != "tags"
                    },
                )
                for highlight in book["highlights"]
            ]

            yield ReadwiseExportResults(
                **{
                    key: value
                    for key, value in book.items()
                    if key not in ["book_tags", "highlights"]
                },
                book_tags=book_tags,
                highlights=highlights,
            )
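
The nested unpacking above can be tried without network access. The sketch below is an illustration under assumptions: `Tag`, `ExportHighlight`, and `ExportBook` are hypothetical, trimmed-down stand-ins for `ReadwiseTag`, `ReadwiseExportHighlight`, and `ReadwiseExportResults`, and the sample payload only carries a few fields rather than the full API response.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tag:  # hypothetical stand-in for ReadwiseTag
    id: int
    name: str


@dataclass
class ExportHighlight:  # hypothetical subset of ReadwiseExportHighlight
    id: int
    text: str
    tags: List[Tag] = field(default_factory=list)


@dataclass
class ExportBook:  # hypothetical subset of ReadwiseExportResults
    user_book_id: int
    title: str
    book_tags: List[Tag] = field(default_factory=list)
    highlights: List[ExportHighlight] = field(default_factory=list)


def parse_book(book: dict) -> ExportBook:
    """Convert one entry of the export `results` list into nested dataclasses."""
    highlights = [
        ExportHighlight(
            tags=[Tag(**tag) for tag in highlight["tags"]],
            # Pass every other key through unchanged.
            **{k: v for k, v in highlight.items() if k != "tags"},
        )
        for highlight in book["highlights"]
    ]
    return ExportBook(
        **{k: v for k, v in book.items() if k not in ("book_tags", "highlights")},
        book_tags=[Tag(**tag) for tag in book["book_tags"]],
        highlights=highlights,
    )


# Made-up sample data shaped like one book in the export response.
sample_book = {
    "user_book_id": 1,
    "title": "Example Book",
    "book_tags": [{"id": 10, "name": "reading"}],
    "highlights": [
        {"id": 100, "text": "A highlight", "tags": [{"id": 11, "name": "favorite"}]}
    ],
}

parsed = parse_book(sample_book)
```

Because the remaining keys are passed straight into the dataclass constructor, any field the API returns that the model does not declare raises a `TypeError`, so the real models would need to cover every returned field.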

Daily Review LIST

These methods work with the existing codebase because the Daily Review endpoint does not rely on pagination.

def get_daily_review(self) -> ReadwiseDailyReview:
    """Get Readwise Daily Review results.

    Returns:
        A ReadwiseDailyReview object
    """
    return ReadwiseDailyReview(**self.get("/review/").json())

def get_daily_review_highlights(
    self,
) -> Generator[DailyReviewHighlight, None, None]:
    """Get Readwise Daily Review highlights.

    Yields:
        DailyReviewHighlight objects
    """
    daily_review = self.get_daily_review()
    for highlight in daily_review.highlights:
        yield DailyReviewHighlight(**highlight)
        

Dataclass models

I think adding these dataclass models will help structure the data returned by the new methods: ReadwiseExportResults, ReadwiseExportHighlight, DailyReviewHighlight, and ReadwiseDailyReview.
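
As a rough idea of what the Daily Review models could look like, here is a sketch. The field names are assumptions and only a subset of what the endpoint may return, so they should be checked against the Readwise API docs. The `from_payload` helper is hypothetical; it drops keys the dataclass does not declare, so a new API field would not break construction.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DailyReviewHighlight:
    # Assumed subset of the highlight fields in the review response.
    id: int
    text: str
    title: str
    author: str


@dataclass
class ReadwiseDailyReview:
    # Assumed field names for the top-level review response.
    review_id: int
    review_url: str
    review_completed: bool
    highlights: List[dict]  # raw dicts, converted lazily by the caller


def from_payload(model, payload: dict):
    """Build `model` from `payload`, ignoring keys the dataclass does not declare."""
    known = {f.name for f in fields(model)}
    return model(**{k: v for k, v in payload.items() if k in known})


# Made-up sample data; "unexpected_field" simulates a key the model lacks.
sample_review = {
    "review_id": 1,
    "review_url": "https://example.com/review/1",
    "review_completed": False,
    "highlights": [
        {
            "id": 100,
            "text": "A highlight",
            "title": "Example Book",
            "author": "Jane Doe",
            "unexpected_field": "ignored",
        }
    ],
}

review = from_payload(ReadwiseDailyReview, sample_review)
review_highlights = [
    from_payload(DailyReviewHighlight, h) for h in review.highlights
]
```

Whether to ignore unknown keys or fail loudly is a design choice: ignoring them is more tolerant of API changes, while strict `**payload` unpacking surfaces schema drift immediately.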

References

I came across this issue because I use this package to get highlights for my Readwise to Apple Notes export CLI tool and noticed the export endpoint did not get all of my highlights. See issue at Scarvy/readwise-to-apple-notes#4.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@Scarvy Scarvy added the enhancement New feature or request label Jun 26, 2024