You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I took a peak at your source code. One source for crawling issues is that you currently define in the code not_parseable_ressources. Instead, if you define parseable resources and limit those to only truly parseable resources that are are supported in the sitemap and may contain plain text html links, you can limit issues with unknown extensions. Also you might take a look at using mime types instead of file extensions. I am not sure how that works in Python though.
The text was updated successfully, but these errors were encountered:
I took a peak at your source code. One source for crawling issues is that you currently define in the code not_parseable_ressources. Instead, if you define parseable resources and limit those to only truly parseable resources that are are supported in the sitemap and may contain plain text html links, you can limit issues with unknown extensions. Also you might take a look at using mime types instead of file extensions. I am not sure how that works in Python though.
The text was updated successfully, but these errors were encountered: