Solved – How to find text blocks in a scanned document

I am trying to detect text in a scanned document by examining variations in the lightness of the scan collapsed vertically. Here's a sample of the input I would receive, with the lightness plot of each vertical pixel strip superimposed:

[Image: example scan with the lightness plot superimposed]

Note: I've applied a Gaussian smoothing function to the data about 10 times, but it was pretty wiggly to begin with. It is easy to see that the left margin is really wiggly (i.e., has many extrema).
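For reference, the profile and the smoothing step described above could be computed along these lines. This is only a minimal sketch in plain Python, assuming the scan is already loaded as a list of equal-length rows of grayscale values. The names `column_profile` and `gaussian_smooth` and the `sigma` parameter are illustrative, not taken from the original post.

```python
import math

def column_profile(image):
    """Mean lightness of each pixel column; `image` is a list of equal-length rows."""
    n_rows = len(image)
    return [sum(row[c] for row in image) / n_rows for c in range(len(image[0]))]

def gaussian_smooth(values, sigma=2.0):
    """One pass of Gaussian smoothing; near the edges the kernel is renormalised."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(values)):
        acc = 0.0
        weight = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):  # skip taps that fall outside the data
                acc += w * values[j]
                weight += w
        out.append(acc / weight)
    return out
```

Applying `gaussian_smooth` repeatedly, as in the note above, behaves like a single pass with a wider kernel, so one pass with a larger `sigma` may be a cheaper way to get a similar amount of smoothing.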

Problem: I want to generate a set of critical points of the image.

I've resorted to counting the extrema of the function within an interval (by checking where the derivative is close to zero) and dividing that count by the length of the interval, but that is computationally expensive. (I use Python, and I couldn't find many low-pass filters for the data.)
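The extrema-density idea can be sketched in a single linear pass. Note that this sketch swaps in a slightly different test: it counts sign changes of the first difference instead of checking how close the derivative is to zero, which avoids having to pick a tolerance. The function names `count_extrema` and `extrema_density` are hypothetical.

```python
def count_extrema(values):
    """Count interior local extrema as sign changes of the first difference."""
    count = 0
    prev_sign = 0
    for a, b in zip(values, values[1:]):
        diff = b - a
        sign = (diff > 0) - (diff < 0)  # +1, 0 or -1
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                count += 1  # slope flipped, so there is an extremum in between
            prev_sign = sign
    return count

def extrema_density(values, start, end):
    """Extrema per pixel column over the half-open interval [start, end)."""
    return count_extrema(values[start:end]) / (end - start)
```

Flat runs are skipped, so a plateau between a rise and a fall counts as one extremum.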

Thanks!

A moving standard deviation sounds like a reasonable thing to use. Here is a toy example in unoptimized, untested pseudo-C; it may go out of bounds or not behave as I expect, but you should get the general idea:

#include <math.h>
#include <stdio.h>

/* Placeholder sizes; set these to match your data. */
enum { NPixelColumns = 1000 }; /* the number of pixel columns */
enum { WindowSize = 10 };      /* the size of the moving window for the standard deviation */

double BrightnessVals[NPixelColumns]; /* someplace to store your data initially */

int main(void)
{
    int startIndex; /* where the moving window starts */
    int lcv;        /* generic loop control variable */

    for (startIndex = 0; startIndex <= NPixelColumns - WindowSize; startIndex++)
    {
        double sum = 0.0; /* the sum of values in the window */
        double xbar;      /* the mean in the window */
        double SS = 0.0;  /* the sum of squared differences from the mean */

        for (lcv = 0; lcv < WindowSize; lcv++)
        {
            sum += BrightnessVals[startIndex + lcv];
        }
        xbar = sum / WindowSize;

        for (lcv = 0; lcv < WindowSize; lcv++)
        {
            double delta = BrightnessVals[startIndex + lcv] - xbar;
            SS += delta * delta;
        }

        /* sample standard deviation: sqrt(SS / (n - 1)) */
        printf("At step %i the moving SD is: %f\n", startIndex, sqrt(SS / (WindowSize - 1)));
    }
    return 0;
}

In R this kind of thing is a snap:

sdwindow <- function(start, end, data) {
    return(sd(data[start:end]))
}

nsamp <- 1000       # the number of samples to look over
windowsize <- 10    # the size of the window to get the SD of
x <- rnorm(nsamp)   # sample data
start <- 1:(nsamp - windowsize + 1)  # starting points for the window
end <- windowsize:nsamp              # ending points, so each window holds exactly windowsize values
# Vectorize saves me the trouble of figuring out mapply for the nth time.
doit <- Vectorize(sdwindow, vectorize.args = c("start", "end"))
doit(start, end, x)  # generate the result
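Since the question mentions Python, the same moving standard deviation can be sketched there using only the standard library. This is a rough equivalent and not part of the original answer; the function name `moving_sd` is made up for illustration, and `window_size` is assumed to be at least 2 because `statistics.stdev` needs two or more points.

```python
import statistics

def moving_sd(values, window_size):
    """Sample standard deviation of every full window of `window_size` values."""
    return [
        statistics.stdev(values[start:start + window_size])
        for start in range(len(values) - window_size + 1)
    ]
```

The result has `len(values) - window_size + 1` entries, one per window position, so entry `i` describes columns `i` through `i + window_size - 1`. This recomputes each window from scratch, which is fine for a single scan line but could be replaced by a running-sum update if speed matters.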
