[Improvement] Only Google Results. #8

Open
patyarishimai opened this issue Oct 16, 2024 · 4 comments

@patyarishimai

Of all the metasearch engines available, Araa is the single best Google scraper in terms of Google result accuracy, and I find it a downside that it sometimes needs to fall back on Qwant.

I suggest a method, perhaps based on the resolvers or captcha solvers used by projects like SearXNG and 4get, to get results only from the Google engine. Sadly, I am in no position to recommend better solutions, but I hope that, at the very least, we can keep reliance on Qwant results to a bare minimum.

@TEMtheLEM
Owner

A captcha solver was supported in the past (very early this year, in fact), but it was dropped for reasons I can't completely remember atm. This decision was made by the project's lead, @Extravi.

The feature could possibly be added back, but there'd need to be a discussion about this with the project leader. I'm on board with the system coming back as an option for instance maintainers who only want to use Google, but I also don't want to make my branch too different from upstream, so adding this to my branch as an extra patch is debatable.

The next best option is to try other engines before falling back to Qwant. Obviously this isn't a perfect solution, but if you have any preferable fallback engines in mind, feel free to drop some names so I can look into them.
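
To illustrate the idea of trying other engines before Qwant, here is a minimal sketch of an ordered fallback chain. It is not Araa's actual code: `EngineBlocked`, `search_with_fallback`, and the stub engines are hypothetical names made up for this example, and a real engine wrapper would perform an HTTP request instead of returning canned strings.

```python
from typing import Callable, List, Optional, Tuple


class EngineBlocked(Exception):
    """Hypothetical error: the upstream engine blocked us or served a captcha."""


# An engine is a (name, fetch function) pair; fetch returns a list of results.
Engine = Tuple[str, Callable[[str], List[str]]]


def search_with_fallback(
    query: str, engines: List[Engine]
) -> Tuple[Optional[str], List[str]]:
    """Try each engine in order and return the first non-empty result set."""
    for name, fetch in engines:
        try:
            results = fetch(query)
        except EngineBlocked:
            continue  # this engine is unavailable, move on to the next one
        if results:
            return name, results
    return None, []  # every engine was blocked or returned nothing


# Stub engines standing in for real scrapers (assumptions for illustration).
def google_stub(query: str) -> List[str]:
    raise EngineBlocked("captcha served")


def alternative_stub(query: str) -> List[str]:
    return [f"alternative result for {query}"]


def qwant_stub(query: str) -> List[str]:
    return [f"qwant result for {query}"]


# Order matters: Qwant is only reached if everything before it fails.
ENGINES: List[Engine] = [
    ("google", google_stub),
    ("alternative", alternative_stub),
    ("qwant", qwant_stub),
]

engine_used, results = search_with_fallback("araa", ENGINES)
```

With this layout, adding a preferred fallback is just a matter of inserting another entry ahead of Qwant in the list.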

@patyarishimai
Author

I presume the project's lead had a reason related to Araa's performance for dropping the captcha solver. Nevertheless, I would like to request that it be added back, as I think it could truly help with all the requests made on the public instances, unless, of course, @Extravi has stated a reason against it.

In any case, I think the best fallback engine is Startpage, as it is a Google clone. Perhaps Presearch? As far as I know, it uses Google results as well.

Lastly, I truly appreciate the support you both give to the project! Thank you so much, @TEMtheLEM.

@Extravi

Extravi commented Oct 22, 2024

It wasn't reliable because once Google blocks the instance, even if you solve the captcha, you only stay unblocked for less time than if you hadn't been served a captcha in the first place. So basically you would get blocked more often and served more captchas, because it would invalidate your instance cookie more often, making it unreliable.

@patyarishimai
Author

@Extravi that's a letdown, because I thought it might help. Have you thought about a method for relying less on alternative search engines and focusing just on Google? As I have said, the UI, results, and performance are the best of the best, but it all goes downhill when it switches to Qwant.
