Record not followed by newline (conversion error) #140

Open
mw0000 opened this issue Feb 7, 2022 · 1 comment

Comments

mw0000 commented Feb 7, 2022

Hi, how should I deal with this error? I'm trying to convert some really old ARC files for use in SolrWayback.

mw@webarch:~/solrwayback/indexing/warcs1$ warcio recompress test2.arc.gz test2.warc.gz
    WARNING: Record not followed by newline, perhaps Content-Length is invalid
    Offset: 52006972
    Remainder: b'http://www.omega.poznet.pl:80/rekin.html 212.126.5.228 200101211835 text/html 4274\n'
Recompress Failed: test2.arc.gz could not be read as a WARC or ARC

ikreymer (Member) commented Feb 8, 2022

Can you share the ARC file that is causing the error? It may be using a format variant that is not supported yet.
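
For anyone triaging a similar warning: the `Remainder` line in the output above has five space-separated fields, which matches the layout of an ARC v1 record header (URL, IP address, archive date, content type, length). That suggests the reader hit the next record's header where it expected a blank line. Below is a minimal Python sketch for pulling such a line apart. `ArcHeader` and `parse_arc_v1_header` are hypothetical helper names, not part of warcio, and the field layout is an assumption based on the ARC v1 format.

```python
from typing import NamedTuple


class ArcHeader(NamedTuple):
    # Field names are illustrative; they follow the assumed ARC v1 layout.
    url: str
    ip: str
    date: str
    content_type: str
    length: int


def parse_arc_v1_header(line: bytes) -> ArcHeader:
    """Split a space-separated ARC v1 style header line into five fields."""
    # latin-1 maps every byte to a character, so decoding cannot fail
    # on unusual bytes in very old crawl data.
    parts = line.decode("latin-1").rstrip("\r\n").split(" ")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    url, ip, date, content_type, length = parts
    return ArcHeader(url, ip, date, content_type, int(length))


# The Remainder bytes copied from the warning in the report above.
remainder = (
    b"http://www.omega.poznet.pl:80/rekin.html 212.126.5.228 "
    b"200101211835 text/html 4274\n"
)
header = parse_arc_v1_header(remainder)
print(header)
print("date digits:", len(header.date))
```

For this line the date field comes out as 12 digits, whereas the ARC v1 format describes a 14-digit YYYYMMDDhhmmss timestamp. Whether that is related to the failure is not established in this thread.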
