Robust Text Detection with Edge Enhanced MSER




This is an implementation of Chen, Huizhong, et al., “Robust Text Detection in Natural Images with Edge-Enhanced Maximally Stable Extremal Regions” [1], partly based on the sample available on the Matlab site [2].

It was partly motivated by the fact that the example on the Matlab site [2] relies on the helperGrowEdges and helperStrokeWidth functions, whose sources are sadly only available to owners of the newest Matlab. Thus I decided to implement those functions based on the literature and some of my own assumptions (e.g. pruning only the 8 neighbours, etc).
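To make the 8-neighbour pruning idea concrete, here is a minimal sketch in plain C++11 (no OpenCV). It is not the code from my repository and not Matlab's helperGrowEdges: the names Mask, growEdges and pruneMser are made up for illustration. It assumes the gradient angle comes from atan2(gy, gx) with y running down the image rows, and that for dark text on a light background the gradient points away from the stroke. Each edge pixel is grown one step along its (quantised) gradient direction, and the grown pixels are then removed from the MSER mask.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major binary mask (hypothetical helper type, not from the original code).
struct Mask {
    int rows, cols;
    std::vector<std::uint8_t> data;
    Mask(int r, int c) : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0) {}
    std::uint8_t& at(int y, int x) { return data[static_cast<std::size_t>(y) * cols + x]; }
    std::uint8_t at(int y, int x) const { return data[static_cast<std::size_t>(y) * cols + x]; }
};

// Grow every edge pixel by one step into the 8-neighbour that best matches
// its gradient direction. `angle` holds atan2(gy, gx) per pixel, in radians.
// Assumption: for dark text the gradient points from the stroke towards the
// brighter background, so growing along it moves away from the text.
Mask growEdges(const Mask& edges, const std::vector<float>& angle, bool darkText) {
    assert(angle.size() == edges.data.size());
    const double pi = std::acos(-1.0);
    // Offsets indexed by octant: 0 = +x, 2 = +y (down), 4 = -x, 6 = -y (up).
    static const int dx[8] = {1, 1, 0, -1, -1, -1, 0, 1};
    static const int dy[8] = {0, 1, 1, 1, 0, -1, -1, -1};
    Mask grown = edges;
    for (int y = 0; y < edges.rows; ++y) {
        for (int x = 0; x < edges.cols; ++x) {
            if (!edges.at(y, x)) continue;
            double a = angle[static_cast<std::size_t>(y) * edges.cols + x];
            if (!darkText) a += pi;  // light text: step against the gradient
            int octant = static_cast<int>(std::lround(a / (pi / 4.0)));
            octant = ((octant % 8) + 8) % 8;  // wrap negative angles into 0..7
            const int ny = y + dy[octant], nx = x + dx[octant];
            if (ny >= 0 && ny < edges.rows && nx >= 0 && nx < edges.cols)
                grown.at(ny, nx) = 1;
        }
    }
    return grown;
}

// Remove the grown edge pixels from the MSER mask.
Mask pruneMser(const Mask& mser, const Mask& grown) {
    assert(mser.data.size() == grown.data.size());
    Mask out(mser.rows, mser.cols);
    for (std::size_t i = 0; i < out.data.size(); ++i)
        out.data[i] = (mser.data[i] && !grown.data[i]) ? 1 : 0;
    return out;
}
```

In a real pipeline the edge mask would typically come from a Canny detector and the angles from Sobel gradients of the grey-scale image. Growing by a single step is the simplest choice here; a blurrier image may need more than one.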

After reading through the original paper [1], here’s how the algorithm works in a nutshell:

  • Create Maximally Stable Extremal Regions (MSER) as the basic candidate regions for the text. The reason is that MSER is fairly robust to viewpoint, lighting, and scale changes. It also relies on the assumption that text in real life is normally distinct enough from the background.
  • However, since MSER is sensitive to blurring (some regions might spill over into neighbouring ones), an Edge-Enhanced MSER is introduced to counter that issue. It’s basically the MSER region with pixels pruned around the edge pixels along the gradient direction.
  • Additional geometric filtering is applied (in my case via connected component labelling) in order to remove blobs that don’t match the given criteria (e.g. min area, eccentricity, etc).
  • Create a distance-transformed matrix from the Edge-Enhanced MSER.
  • Compute the stroke width image from the distance-transformed matrix. This is achieved by recursively propagating the maximum values along the ridges of the matrix to neighbours which have lower values.
  • Lastly, the resulting connected components are filtered based on the ratio of the standard deviation to the mean of each component’s stroke width values.
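The last three steps can be sketched in plain C++11 as below. This is a simplified illustration under my own assumptions, not the repository code and not Matlab's helperStrokeWidth: Grid, distanceTransform, strokeWidth and strokeWidthVariation are hypothetical names, the distance transform is a basic two-pass city-block one (with OpenCV you would more likely call cv::distanceTransform), and the variation is computed over one component mask at a time, leaving out the connected component labelling.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major integer image (hypothetical helper type, not from the original code).
struct Grid {
    int rows, cols;
    std::vector<int> v;
    Grid(int r, int c) : rows(r), cols(c), v(static_cast<std::size_t>(r) * c, 0) {
        assert(r >= 0 && c >= 0);
    }
    int& at(int y, int x) { return v[static_cast<std::size_t>(y) * cols + x]; }
    int at(int y, int x) const { return v[static_cast<std::size_t>(y) * cols + x]; }
    bool inside(int y, int x) const { return y >= 0 && y < rows && x >= 0 && x < cols; }
};

// Two-pass city-block distance from each foreground (nonzero) pixel to the
// nearest background (zero) pixel. Pixels outside the image are ignored.
Grid distanceTransform(const Grid& mask) {
    const int inf = mask.rows + mask.cols;
    Grid d(mask.rows, mask.cols);
    for (int y = 0; y < d.rows; ++y)            // forward pass
        for (int x = 0; x < d.cols; ++x) {
            if (!mask.at(y, x)) continue;       // background stays 0
            int best = inf;
            if (y > 0) best = std::min(best, d.at(y - 1, x) + 1);
            if (x > 0) best = std::min(best, d.at(y, x - 1) + 1);
            d.at(y, x) = best;
        }
    for (int y = d.rows - 1; y >= 0; --y)       // backward pass
        for (int x = d.cols - 1; x >= 0; --x) {
            if (y + 1 < d.rows) d.at(y, x) = std::min(d.at(y, x), d.at(y + 1, x) + 1);
            if (x + 1 < d.cols) d.at(y, x) = std::min(d.at(y, x), d.at(y, x + 1) + 1);
        }
    return d;
}

// Propagate ridge values downhill: visiting pixels from the largest distance
// to the smallest, each pixel hands its value to 8-neighbours whose original
// distance is lower. Values stay in distance units (about half the stroke width).
Grid strokeWidth(const Grid& dist) {
    Grid sw = dist;
    if (dist.v.empty()) return sw;
    const int maxD = *std::max_element(dist.v.begin(), dist.v.end());
    for (int level = maxD; level >= 1; --level)
        for (int y = 0; y < dist.rows; ++y)
            for (int x = 0; x < dist.cols; ++x) {
                if (dist.at(y, x) != level) continue;
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx) {
                        const int ny = y + dy, nx = x + dx;
                        if (!dist.inside(ny, nx)) continue;
                        const int dq = dist.at(ny, nx);
                        if (dq > 0 && dq < level)
                            sw.at(ny, nx) = std::max(sw.at(ny, nx), sw.at(y, x));
                    }
            }
    return sw;
}

// Ratio of standard deviation to mean over the foreground stroke width values.
double strokeWidthVariation(const Grid& sw) {
    double sum = 0.0, sq = 0.0;
    std::size_t n = 0;
    for (int s : sw.v) if (s > 0) { sum += s; ++n; }
    if (n == 0) return 0.0;
    const double mean = sum / n;
    for (int s : sw.v) if (s > 0) sq += (s - mean) * (s - mean);
    return std::sqrt(sq / n) / mean;
}
```

A component would then be kept as a text candidate when strokeWidthVariation stays below some threshold. The threshold is a tunable value that I am not fixing here. Because the filter uses a ratio, it does not matter that the values are half-widths rather than full stroke widths.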

Here’s the sample original image:


And here’s the sample result, with the stroke width image on the right-hand side and the Tesseract-deciphered text placed next to each detected candidate text region.

My code is available here: It requires C++11, OpenCV, and Tesseract libraries.


[1] Chen, Huizhong, et al. “Robust Text Detection in Natural Images with Edge-Enhanced Maximally Stable Extremal Regions.” Image Processing (ICIP), 2011 18th IEEE International Conference on. IEEE, 2011.

[2] Automatically Detect and Recognize Text in Natural Images 

7 thoughts on “Robust Text Detection with Edge Enhanced MSER”

  1. Rodrigo

    Awesome work! I’ve barely started getting my hands dirty with Octave’s vlfeat package. The mser filter they provide works marvelous, but as you’ve stated in this post, matlab has those two undocumented commands. I’ve been “trying” (more like blindly hacking) to come up with a pure Octave solution but to no avail. Have you tried taking a stab at this problem in the Octave domain? Best,
