Quite the Character – Character Segmentation in the Browser – Typography





On GitHub: deanmarano / canvas_presentation

Quite the Character

Character Segmentation in the Browser

by Dean Marano

Typography

Why is OCR so hard?

Fonts

Common Approaches

  • Character Level Properties (Classical)
  • Character Level Recognition
  • Word Level Recognition

Post Processing

  • Spell Checking
  • Grammar Checking

HTML5 Canvas

The HTML5 canvas element gives scripts pixel-level access to images, so image processing can now be done directly in the browser.
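As a minimal sketch of what pixel-level access looks like, the snippet below turns RGBA pixel data into a binary "ink" mask. The function name toInkMask, the threshold of 128, and the luma weights are illustrative assumptions, not taken from the presentation's code. The input is any object shaped like the ImageData returned by a 2D context's getImageData, so the function can be tried without a browser.

```javascript
// Convert RGBA pixels (the layout used by canvas ImageData) into a binary ink mask.
// `image` is any object shaped like ImageData: { width, height, data },
// where data holds 4 bytes per pixel in R, G, B, A order, row by row.
// Hypothetical helper: the name and the default threshold are assumptions.
function toInkMask(image, threshold = 128) {
  const { width, height, data } = image;
  const mask = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const r = data[4 * i];
    const g = data[4 * i + 1];
    const b = data[4 * i + 2];
    const a = data[4 * i + 3];
    // Rec. 601 luma as a rough measure of brightness.
    const luma = 0.299 * r + 0.587 * g + 0.114 * b;
    // Transparent pixels count as background; dark visible pixels count as ink.
    mask[i] = a > 0 && luma < threshold ? 1 : 0;
  }
  return mask;
}

// In a browser the input would come from a 2D canvas context, for example:
//   const ctx = canvas.getContext("2d");
//   ctx.drawImage(img, 0, 0);
//   const image = ctx.getImageData(0, 0, canvas.width, canvas.height);

// Tiny 2x2 stand-in: black, white on the first row; transparent, dark gray on the second.
const sample = {
  width: 2,
  height: 2,
  data: new Uint8ClampedArray([
    0, 0, 0, 255,   255, 255, 255, 255,
    0, 0, 0, 0,     60, 60, 60, 255,
  ]),
};
const sampleMask = toInkMask(sample);
console.log(Array.from(sampleMask)); // [1, 0, 0, 1]
```

Working on a flat 0/1 mask keeps the later steps independent of color and transparency.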

Why use the browser?

The browser provides a standard, license-free environment that anyone can use and access on a variety of devices.

HTML5 Demo

My Strategy - Segmentation By Whitespace

Demo
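The slides do not spell out the algorithm, so the following is only one plausible reading of segmentation by whitespace: count the ink in every pixel column and split wherever a column is completely empty. The names columnInk and segmentByWhitespace and the minGap parameter are hypothetical. The same idea applied to rows instead of columns would separate lines of text.

```javascript
// Count ink pixels in each column of a binary mask (1 = ink, 0 = background).
function columnInk(mask, width, height) {
  const counts = new Array(width).fill(0);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      counts[x] += mask[y * width + x];
    }
  }
  return counts;
}

// Return [start, end) index ranges of consecutive non-empty entries.
// Gaps narrower than minGap (an assumed tuning knob) do not split a segment.
function segmentByWhitespace(counts, minGap = 1) {
  const segments = [];
  let start = -1; // index where the current segment began, or -1 if none is open
  let gap = 0;    // number of empty entries seen since the last ink
  for (let i = 0; i < counts.length; i++) {
    if (counts[i] > 0) {
      if (start === -1) start = i;
      gap = 0;
    } else if (start !== -1) {
      gap++;
      if (gap >= minGap) {
        segments.push([start, i - gap + 1]);
        start = -1;
        gap = 0;
      }
    }
  }
  // Close a segment that runs to the edge, dropping any trailing gap.
  if (start !== -1) segments.push([start, counts.length - gap]);
  return segments;
}

// A 7x3 mask with two glyph-like blobs separated by two empty columns.
const wordMask = [
  0, 1, 1, 0, 0, 1, 0,
  0, 1, 0, 0, 0, 1, 0,
  0, 0, 1, 0, 0, 1, 0,
];
const wordSegments = segmentByWhitespace(columnInk(wordMask, 7, 3));
console.log(wordSegments); // [[1, 3], [5, 6]]
```

Raising minGap merges nearby blobs: with minGap set to 3 the example above yields the single segment [1, 6]. This is one way to tell gaps between letters from gaps between words.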

Finding Characters in Words

Question: Can we use whitespace segmentation again?

Answer: Sometimes...

Creating a Baseline

For each character we wish to identify, we draw it to the screen and create two histograms. We use the same whitespace segmentation to trim the whitespace around the character as tightly as possible before taking the histograms.
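The baseline step might be sketched as follows. The slides do not say what the two histograms are; this sketch assumes they are per-column and per-row ink counts taken over the tightly cropped glyph. The names inkBounds and glyphHistograms are hypothetical. In a browser, the mask would come from drawing the character with the canvas fillText method and reading the pixels back.

```javascript
// Find the tight bounding box of ink in a binary mask, or null if the mask is empty.
function inkBounds(mask, width, height) {
  let left = width, right = -1, top = height, bottom = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x]) {
        if (x < left) left = x;
        if (x > right) right = x;
        if (y < top) top = y;
        if (y > bottom) bottom = y;
      }
    }
  }
  return right < 0 ? null : { left, top, right, bottom };
}

// Build two histograms over the cropped glyph (assumed here to be
// ink per column and ink per row; the slides do not specify).
function glyphHistograms(mask, width, height) {
  const box = inkBounds(mask, width, height);
  if (box === null) return { columns: [], rows: [] };
  const columns = new Array(box.right - box.left + 1).fill(0);
  const rows = new Array(box.bottom - box.top + 1).fill(0);
  for (let y = box.top; y <= box.bottom; y++) {
    for (let x = box.left; x <= box.right; x++) {
      const ink = mask[y * width + x];
      columns[x - box.left] += ink;
      rows[y - box.top] += ink;
    }
  }
  return { columns, rows };
}

// A 5x5 mask holding a 3x3 "L" shape surrounded by whitespace.
const glyph = [
  0, 0, 0, 0, 0,
  0, 1, 0, 0, 0,
  0, 1, 0, 0, 0,
  0, 1, 1, 1, 0,
  0, 0, 0, 0, 0,
];
const baseline = glyphHistograms(glyph, 5, 5);
console.log(baseline.columns); // [3, 1, 1]
console.log(baseline.rows);    // [1, 1, 3]
```

Cropping first means the histograms depend only on the glyph's shape, not on where it was drawn on the canvas.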

Future Work

  • Comparing Waveforms
  • Skewing

THE END