The tesseract-ocr organization on GitHub develops the Tesseract Open Source OCR Engine, a widely used engine for optical character recognition. Its repositories are written mainly in C++, Python, HTML, and Ruby. Notable projects include tesseract, the engine itself, and tessdata, which provides trained models.
Tesseract Open Source OCR Engine (main repository)
Trained models with fast variant of the "best" LSTM models + legacy models
Tesseract documentation
Best (most accurate) trained LSTM models.
Source training data for Tesseract for lots of languages
Train Tesseract LSTM with make
Fast integer versions of trained LSTM models
Various documents related to Tesseract OCR
Data used for LSTM model training
Tesseract documentation
Tesseract Config files
Repository for tesseract testing
User contributed (non Google) OCR models for Tesseract
Tesseract source code and API documentation
The tesseract-ocr organization writes its core OCR engine in C++, and uses Python for training scripts and Ruby for documentation.
The primary languages used by tesseract-ocr are C++, Python, HTML, Ruby, Makefile, and Shell. Together they cover the development, build tooling, and documentation of its OCR projects.
All repositories under the tesseract-ocr organization are public on GitHub, so anyone can access, review, and contribute to its optical character recognition projects.