kodezy/kocr

KOCR

Digit OCR (0–9) using OpenCV and K-Nearest Neighbors. Label your own images, train a classifier, and recognize digits in new images.
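As a rough illustration of the K-Nearest Neighbors idea, the sketch below classifies a feature vector by majority vote among its k closest training samples. It is plain Python, not the project's code: the function name `knn_predict`, the toy 2-D features, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(samples, labels, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Pair each training sample's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(sample, query), label)
        for sample, label in zip(samples, labels)
    )
    # Vote among the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors standing in for flattened digit images (hypothetical data).
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = [1, 1, 1, 7, 7, 7]

print(knn_predict(samples, labels, (0.1, 0.1)))  # prints 1
print(knn_predict(samples, labels, (5.1, 5.0)))  # prints 7
```

In a real digit pipeline the feature vectors would typically be resized, flattened image patches rather than 2-D points.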

Demo

Collect (labeling): [screenshot: collect labeling interface]
Predict (results): [screenshot: predict results]

Requirements

  • Python 3.10 or newer
  • A graphical display — collect and predict open OpenCV windows for interactive labeling and visualization
  • macOS, Linux, or Windows with GUI support (headless/SSH-only environments will not work for interactive commands)

Installation

pip install -r requirements.txt

Quick Start

Use an image containing clear digit characters (printed text, screenshots, etc.).

1. Collect training data

Label digits interactively:

python main.py collect my-image.png
# or label all images in a folder:
python main.py collect ./images/

Controls:

Key        Action
0–9        Label the current contour as that digit
SPACE      Skip the contour (not a digit)
BACKSPACE  Undo the last label
ESC        Quit and save progress
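The key bindings above could be dispatched along these lines. This is a sketch only: it assumes keys arrive as integer codes (as they would from OpenCV's `cv2.waitKey`), and the helper `interpret_key` and the action names are hypothetical. Backspace in particular is reported as 8 on some platforms and 127 on others, so both are accepted here.

```python
ESC = 27
SPACE = 32
BACKSPACE_CODES = {8, 127}  # assumption: the code varies by platform


def interpret_key(code):
    """Map a raw key code to an (action, digit) pair; digit is None unless labeling."""
    code &= 0xFF  # keep only the low byte, a common habit with cv2.waitKey results
    if ord("0") <= code <= ord("9"):
        return ("label", code - ord("0"))
    if code == SPACE:
        return ("skip", None)
    if code in BACKSPACE_CODES:
        return ("undo", None)
    if code == ESC:
        return ("quit", None)
    return ("ignore", None)  # any other key, or no key pressed (-1)


print(interpret_key(ord("7")))  # prints ('label', 7)
print(interpret_key(27))        # prints ('quit', None)
```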

Output files (default):

  • models/generalsamples.data
  • models/generalresponses.data

2. Train the model

python main.py train

Output file (default): models/model.yml

Options:

  • --samples PATH — input samples file (default: models/generalsamples.data)
  • --responses PATH — input labels file (default: models/generalresponses.data)
  • --model PATH — output model file (default: models/model.yml)

3. Predict digits

python main.py predict my-image.png

Options:

  • --threshold FLOAT — confidence threshold 0.0–1.0 (default: 0.0)
  • --no-display — skip OpenCV visualization windows
  • --model PATH — model file (default: models/model.yml)
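The README does not say how the confidence behind --threshold is computed. One plausible reading, sketched below purely as an assumption, is the share of the k nearest neighbors that agree with the winning digit; predictions below the threshold are then dropped. The names `vote_confidence` and `keep_confident` are hypothetical.

```python
from collections import Counter


def vote_confidence(neighbor_labels):
    """Return (label, confidence), where confidence is the winning vote share."""
    votes = Counter(neighbor_labels)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbor_labels)


def keep_confident(predictions, threshold=0.0):
    """Keep (label, confidence) pairs whose confidence is at least the threshold."""
    return [(label, conf) for label, conf in predictions if conf >= threshold]


# Labels of the 3 nearest neighbors for three detected contours (made-up data).
neighbor_sets = [[3, 3, 3], [8, 3, 8], [1, 7, 4]]
predictions = [vote_confidence(n) for n in neighbor_sets]

print(len(keep_confident(predictions)))       # prints 3 (default 0.0 keeps everything)
print(len(keep_confident(predictions, 0.5)))  # prints 2
```

Under this reading, the default of 0.0 never rejects a prediction, which matches a default that is meant to be permissive.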

Workflow

collect → train → predict

All generated files are saved in models/ by default (gitignored except .gitkeep).
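For orientation, the three subcommands and their documented defaults could be wired up with argparse roughly as follows. Whether src/cli.py actually uses argparse is not stated, so treat this as an assumption; the positional argument names `path` and `image` are also hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented collect/train/predict options."""
    parser = argparse.ArgumentParser(prog="main.py", description="Digit OCR with KNN")
    sub = parser.add_subparsers(dest="command", required=True)

    collect = sub.add_parser("collect", help="label digits interactively")
    collect.add_argument("path", help="image file or folder of images")

    train = sub.add_parser("train", help="train the KNN model")
    train.add_argument("--samples", default="models/generalsamples.data")
    train.add_argument("--responses", default="models/generalresponses.data")
    train.add_argument("--model", default="models/model.yml")

    predict = sub.add_parser("predict", help="recognize digits in an image")
    predict.add_argument("image", help="image file to read digits from")
    predict.add_argument("--threshold", type=float, default=0.0)
    predict.add_argument("--no-display", action="store_true")
    predict.add_argument("--model", default="models/model.yml")
    return parser


args = build_parser().parse_args(["predict", "my-image.png", "--threshold", "0.5"])
print(args.command, args.image, args.threshold, args.no_display)
# prints: predict my-image.png 0.5 False
```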

Project Structure

kocr/
├── main.py              # Entry point
├── src/
│   ├── cli.py           # Command-line interface
│   ├── collect.py       # Interactive data labeling
│   ├── train.py         # KNN model training
│   ├── predict.py       # Digit recognition
│   └── constants.py     # Shared configuration
├── models/              # Generated data and models (gitignored)
├── docs/assets/         # README screenshots
├── tests/               # Automated tests
├── pytest.ini
└── requirements.txt

Limitations

  • GUI required: collect always needs a display; predict needs one unless --no-display is set
  • No pre-trained model: you must label and train on your own images
  • Image-dependent accuracy: low contrast, noise, or unusual fonts may require tuning parameters in src/constants.py
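On a machine that may be headless, a wrapper script could guess whether a display is reachable and add --no-display to predict when it is not. The check below is a heuristic sketch, not part of KOCR: the helpers `display_available` and `predict_command` are hypothetical, and the assumption that macOS and Windows sessions always have a display will not hold everywhere (for example over SSH).

```python
import os
import sys


def display_available(platform=sys.platform, env=os.environ):
    """Heuristically guess whether a GUI display is reachable."""
    if platform.startswith("linux"):
        # X11 sessions set DISPLAY; Wayland sessions set WAYLAND_DISPLAY.
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    # Assumption: other platforms normally run with a display attached.
    return True


def predict_command(image, have_display):
    """Build the predict command line, adding --no-display when headless."""
    cmd = ["python", "main.py", "predict", image]
    if not have_display:
        cmd.append("--no-display")
    return cmd


print(predict_command("my-image.png", display_available("linux", {})))
# prints: ['python', 'main.py', 'predict', 'my-image.png', '--no-display']
```

The collect command has no such fallback, so a script like this can only help with predict.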

Development

pip install -r requirements.txt
python -m ruff check src/ main.py tests/
pytest

License

MIT — see LICENSE.
