[Google Scholar API] Publications in Arabic not parsed correctly. #2225

Open
paholg opened this issue Dec 12, 2024 · 2 comments
Assignees: ishiharaf
Labels: status: queued (Ready to work on), type: bug (Something is broken)

Comments

@paholg

paholg commented Dec 12, 2024

Publications in Arabic are not parsed correctly.

See this example: Google Scholar returns a paper from 2007 in Arabic, but SerpApi puts the author list in the publication field and doesn't populate the authors field at all. This is a little hard to see because the HTML renders the text right-to-left, whereas the JSON shows it left-to-right. Note the ellipsis in the first line of grey text in the HTML; it matches the ellipsis in the publication field in the JSON.

[Screenshot: 2024-12-12_09 39 00-s]

Public links: Playground Link
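
For anyone who wants to check their own responses for the same symptom, here is a minimal sketch. It assumes each entry in `organic_results` carries a `publication_info` object with a `summary` string and an optional `authors` list; those key names are my assumption about the JSON layout, not something confirmed in this thread, and `results_missing_authors` is a hypothetical helper name. A missing authors list may have other causes, so treat a hit as a prompt to compare against the HTML rather than proof of this bug.

```python
from typing import Any


def results_missing_authors(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return results that have a publication summary but no authors list.

    Assumed layout (not confirmed in this thread):
    response["organic_results"][i]["publication_info"] holds
    "summary" (str) and, when parsed, "authors" (list).
    """
    flagged = []
    for result in response.get("organic_results", []):
        info = result.get("publication_info") or {}
        # Symptom from the report: summary text present, authors absent.
        if info.get("summary") and not info.get("authors"):
            flagged.append(result)
    return flagged
```

Running this over a saved response and comparing the flagged summaries against the HTML should show whether the author list ended up in the wrong field.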

@paholg paholg added the type: bug Something is broken label Dec 12, 2024
@hilmanski

Thank you very much, @paholg, for reporting this issue. We'll make sure to update you on this thread.

@hilmanski hilmanski added the status: queued Ready to work on label Dec 12, 2024
@ishiharaf

Thanks for reporting this issue, @paholg. Please note that this doesn't happen when the language is set to Arabic, so you can set the language to Arabic as a workaround until we fix it for other languages.

[Screenshot]

Playground Link
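
A minimal sketch of the workaround as a request URL. It assumes that "language" here maps to the `hl` query parameter with the value `ar`; `build_scholar_url` is a hypothetical helper, and the API key is a placeholder you would replace with your own.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://serpapi.com/search.json"


def build_scholar_url(query: str, api_key: str, language: str = "ar") -> str:
    """Build a Google Scholar search URL with the interface language set.

    Assumption: the language setting mentioned above corresponds to `hl`.
    """
    params = {
        "engine": "google_scholar",
        "q": query,
        "hl": language,  # "ar" = Arabic, the workaround described above
        "api_key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

Fetching the resulting URL with any HTTP client should then return the response with the language set to Arabic.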

@ishiharaf ishiharaf self-assigned this Dec 13, 2024