
Recognition of text in images, speech in audio files, recognition of information in PDF!


IvanGaideek/Application-toText


Application: toText


Libraries that helped make the project ❤️ are listed in requirements.txt.

The main functions of the application:

  • Optical character recognition (OCR) of text in images.
  • Speech recognition from audio files.
  • Conversion of PDF files to text documents, with extraction of images and tables.

Installing the application

Link to the installer archive: Yandex.Disk

Launching the application

Install the libraries using the command:

pip install -r requirements.txt

The file to run is main.py.
Documentation about the application is in the docs folder.
Settings are stored in JSON files located in the Settings folder.
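
Because the settings live in JSON files, they can be inspected or loaded with the Python standard library alone. The following is a minimal sketch, not the application's own loader: the function name `load_settings` is hypothetical, and it assumes only that the Settings folder contains `*.json` files, as described above.

```python
import json
from pathlib import Path


def load_settings(settings_dir="Settings"):
    """Read every *.json file in settings_dir into a dict keyed by file name (without extension).

    Hypothetical helper for illustration; the real application may load
    its settings differently.
    """
    settings = {}
    for path in sorted(Path(settings_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            settings[path.stem] = json.load(fh)
    return settings


if __name__ == "__main__":
    # Print the names of the settings files that were found.
    for name in load_settings():
        print(name)
```

Keying the result by file name keeps each settings file separate, so the contents of one file cannot silently overwrite another.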

Additional information and notes:

The final application is about 288 MB, but language packs and recognition models (for speech recognition) are downloaded to your computer as needed during recognition. Unfortunately, version 1.1 has almost no GPU support. Speech recognition requires the ffmpeg package, so install it if it is not already present.
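
To find out whether ffmpeg is already installed before trying speech recognition, a quick check against the system PATH is usually enough. This is a small sketch using only the standard library. The helper name `ffmpeg_available` is hypothetical and is not part of the application.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found.")
    else:
        print("ffmpeg not found; install it before using speech recognition.")
```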
