Missing testdata files for unittests #13
Comments
0146_281.3B.tif
Cc'ing @jbreiden.
@stweil Do we still need more images/testdata from Google?
I'm afraid that we have to find our own solutions without waiting for Google. They cannot provide all images and test data because some might be copyrighted. Therefore it is important to find free replacement images and data. We have nearly all images needed for the unit tests (equationdetect_test still needs an image).
If you are looking for free replacement images and data for use in unit testing, there are several options you can consider:
Free Image Banks: There are several free image banks available on the internet, where you can find high-quality, public domain images to use in your tests. Some examples include Unsplash, Pixabay and Pexels.
Test Data Databases: In addition to images, you may need test data for your unit tests. There are databases of test data freely available on the web that can be used to create realistic test scenarios. Search for open datasets related to your application domain.
Creating Images and Test Data: If you are unable to find suitable images or test data, consider creating your own. You can create simple images using free image editing tools like GIMP or Paint.NET, and generate test data using random data generation libraries in Python like Faker.
Community Resources: Don't underestimate the power of community. Search forums, discussion groups, and online communities related to your application domain. Many times, other developers are willing to share images and test data that they have created or found.
Creative Commons Licenses: When searching for free replacement images and data, be sure to check usage licenses. Many free resources are available under Creative Commons licenses, which may have specific attribution requirements or commercial use restrictions.
Thanks, but this issue is not about finding any image. It is about finding very specific images for a very specific task which is part of the unittests.
To resolve this issue, you can follow these steps:
1. Clearly identify which specific images are required for the test cases in question.
2. Make sure these images are available somewhere accessible for testing. This could be in an internal image repository, a cloud storage server, or another accessible location.
3. If images are not available, you may need to create or purchase the necessary images and ensure they are stored in a suitable location.
4. After ensuring that the required images are available, you can update your unit tests to reference these specific images when running your tests.
5. Be sure to clearly document the image requirements for each test case so future developers know which images are needed and where to find them.
6. Rerun your unit tests to ensure that the images are being used correctly and that the tests are passing as expected.
By following these steps, you should be able to solve the problem of finding the specific images needed for the test cases in your unit tests.
I am sorry to say this, but your comments (and your pull requests) are not helpful. They sound like the output of an AI chat bot. If you want to help, you should read this issue carefully (it lists the missing images), look into the test code where these images are used and try to activate that code with replacement images.
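One way to start on the "look into the test code" step is to list which image files the test sources mention and which of them are absent from the test data directory. The following is only a minimal sketch: the helper names, the assumption that tests reference images as quoted string literals ending in a known image extension, and the directory layout are all hypothetical and not taken from the tesseract repository.

```python
import re
from pathlib import Path

# Assumption: test sources name their images as quoted string literals
# such as "0146_281.3B.tif". Names built at runtime will not be found.
IMAGE_LITERAL = re.compile(r'"([^"\s]+\.(?:tif|tiff|png|jpg))"')


def referenced_images(source_dir):
    """Map each image name to the *.cc files in source_dir that mention it."""
    found = {}
    for src in sorted(Path(source_dir).glob("*.cc")):
        text = src.read_text(encoding="utf-8", errors="replace")
        for name in IMAGE_LITERAL.findall(text):
            users = found.setdefault(name, [])
            if src.name not in users:
                users.append(src.name)
    return found


def missing_images(source_dir, testdata_dir):
    """Return the referenced images that do not exist in testdata_dir."""
    testdata = Path(testdata_dir)
    return {
        name: users
        for name, users in referenced_images(source_dir).items()
        if not (testdata / Path(name).name).exists()
    }
```

Calling `missing_images("unittest", "testdata")` from a checkout would then give a starting list of files that need a free replacement, together with the tests that use them. The two directory names in that call are placeholders.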
Ok
testdata/lstm_training.txt is required for building training data for lstm_test
https://github.com/tesseract-ocr/tesseract/blob/master/unittest/lstm_test.cc#L6
// Generating the training data:
// If the format of the lstmf (ImageData) file changes, the training data will
// have to be regenerated as follows:
// ./tesseract/text2image --xsize=800 --font=Arial
// --text=tesseract/testdata/lstm_training.txt --leading=32
// --outputbase=tesseract/testdata/lstm_training.arial
// ./tesseract tesseract/testdata/lstm_training.arial.tif
// tesseract/testdata/lstm_training.arial lstm.train
// --pageseg_mode=6
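The two steps in that comment can be written down as argument lists, which makes it easier to see what each step consumes and produces. This is a sketch only: the function name is hypothetical, and every path and flag is copied verbatim from the comment above, so it may not match the command-line options or directory layout of a current tesseract build.

```python
import shlex


def lstm_training_commands():
    """Return the two commands from the lstm_test.cc comment as argument lists."""
    base = "tesseract/testdata/lstm_training.arial"
    # Step 1: render the training text to an image (writes base + ".tif").
    text2image = [
        "./tesseract/text2image",
        "--xsize=800",
        "--font=Arial",
        "--text=tesseract/testdata/lstm_training.txt",
        "--leading=32",
        "--outputbase=" + base,
    ]
    # Step 2: run tesseract on that image with the lstm.train config.
    # The flag spelling is taken from the comment, not checked against a build.
    tesseract = [
        "./tesseract",
        base + ".tif",
        base,
        "lstm.train",
        "--pageseg_mode=6",
    ]
    return [text2image, tesseract]


if __name__ == "__main__":
    for cmd in lstm_training_commands():
        print(shlex.join(cmd))
```

Both steps depend on testdata/lstm_training.txt being present, since the first command reads it through `--text` and the second command reads the image that the first one writes.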