
Author Topic: How to extract the handwritten characters from table boxes?  (Read 2950 times)

afaf9999

  • Newbie
  • Posts: 3
How to extract the handwritten characters from table boxes?
« on: September 09, 2009, 09:40:33 AM »

Hi all,

I have a graduation project, and the problem it addresses is:
How to extract the handwritten characters from table boxes?
For example, this is the original picture:


And I want it to look like this:


In the project proposal I suggested using the projection profile method (X-Y tree) or the Hough transform to find the straight lines in the image (the table boxes) and remove them.

I tried to apply the projection profile method (X-Y tree) in MATLAB, but I couldn't find code or an algorithm for it. Then I tried the Hough transform, but I didn't fully succeed with it, and I also don't know how to delete the straight lines once I have found them.

So these are my problems, and I hope someone can help me find solutions for them.

Thanks.

ALI.
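
For reference, the projection profile idea mentioned in the question can be sketched roughly as follows. This is plain Python rather than MATLAB so that it runs without any toolbox, and it assumes an already binarized, unskewed image stored as a list of rows (1 = ink). The names `projection_profiles`, `line_positions` and the `min_fill` parameter are made up for this illustration.

```python
def projection_profiles(img):
    """Return (row_sums, col_sums): ink counts per row and per column (1 = ink)."""
    row_sums = [sum(row) for row in img]
    col_sums = [sum(col) for col in zip(*img)]
    return row_sums, col_sums


def line_positions(profile, extent, min_fill=0.8):
    """Indices whose ink count covers at least `min_fill` of the full extent.

    A grid line spans almost the whole image, so its profile value is close
    to `extent`; a handwritten stroke only adds a few pixels.
    """
    return [i for i, count in enumerate(profile) if count >= min_fill * extent]


# Toy 7x7 image: a one-pixel frame plus a short vertical pen stroke inside it.
size = 7
img = [[0] * size for _ in range(size)]
for i in range(size):
    img[0][i] = img[size - 1][i] = 1   # top and bottom frame lines
    img[i][0] = img[i][size - 1] = 1   # left and right frame lines
for y in (2, 3, 4):
    img[y][3] = 1                      # handwritten stroke

row_sums, col_sums = projection_profiles(img)
grid_rows = line_positions(row_sums, extent=size)
grid_cols = line_positions(col_sums, extent=size)
print(grid_rows, grid_cols)
```

On a real scan the threshold would need tuning, and a skewed page would have to be straightened first, since the profiles only peak sharply when the lines are parallel to the image axes.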

BernardLanguillier

  • Sr. Member
  • Posts: 9757
    • http://www.flickr.com/photos/bernardlanguillier/sets/
« Reply #1 on: September 09, 2009, 10:29:45 AM »

It would be very easy to find an algo if there were no intersection between the grid and the characters.

Still, it shouldn't be too hard to find measurable characteristics for the pixels belonging to the grid vs those belonging to characters, knowing that:

- the grid is regularly spaced,
- it is made up of adjacent pixels that are close to being black,
- the grid is surrounded by a mostly white area,
- ...
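
One way to turn the characteristics listed above into something measurable is sketched below, in plain Python and under simplifying assumptions: an 8-bit grayscale image stored as a list of rows, a fixed darkness threshold, and "adjacent dark pixels" measured as the longest unbroken run in each row. The function names and the threshold value are hypothetical, chosen only for this example.

```python
def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as ink (1), everything else as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def longest_run(line):
    """Length of the longest unbroken run of ink pixels in one row or column."""
    best = current = 0
    for value in line:
        current = current + 1 if value else 0
        best = max(best, current)
    return best


gray = [
    [10, 12, 11, 9, 13, 10],          # grid line: dark across the whole row
    [250, 251, 20, 249, 252, 250],    # one pen-stroke pixel
    [250, 15, 18, 250, 251, 249],     # two adjacent pen-stroke pixels
    [11, 10, 12, 13, 9, 10],          # grid line
]
ink = binarize(gray)
runs = [longest_run(row) for row in ink]
print(runs)
```

Rows whose longest run is close to the image width are grid candidates, while short runs point to handwriting. The same function applied to `zip(*ink)` gives the column version.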

Cheers,
Bernard


tomrock

  • Full Member
  • Posts: 243
    • http://tomrockwell.com
« Reply #2 on: September 09, 2009, 11:14:25 AM »

Make the boxes a color that the scanner won't see.

afaf9999

  • Newbie
  • Posts: 3
« Reply #3 on: September 09, 2009, 01:59:40 PM »

Thanks for replying.
Actually, the idea of the project is to extract the characters from the grid even when they overlap or intersect it. The two objects have the same color, which is black, so my solution was to find the straight lines, because handwritten characters cannot be perfectly straight.
But my problem is how to apply this solution.
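
A bare-bones version of the straight-line search via Hough voting might look like the sketch below. It is plain Python, not MATLAB, and assumes a binary image stored as a list of rows (1 = ink). `hough_lines` and `min_votes` are hypothetical names, and there is no peak suppression, so angles adjacent to a true line can also pass the vote threshold.

```python
import math


def hough_lines(img, min_votes):
    """Vote in (theta, rho) space for every ink pixel.

    Lines are parameterised as rho = x*cos(theta) + y*sin(theta), with theta
    in whole degrees from 0 to 179. Returns the sorted (theta, rho) cells
    that collected at least `min_votes` votes.
    """
    votes = {}
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if not value:
                continue
            for deg in range(180):
                theta = math.radians(deg)
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(deg, rho)] = votes.get((deg, rho), 0) + 1
    return sorted(cell for cell, count in votes.items() if count >= min_votes)


# Toy 9x9 image: one full horizontal line, one full vertical line,
# and a short diagonal pen stroke that should not collect enough votes.
size = 9
img = [[0] * size for _ in range(size)]
for i in range(size):
    img[4][i] = 1          # horizontal grid line at y = 4
    img[i][2] = 1          # vertical grid line at x = 2
for x, y in ((5, 1), (6, 2), (7, 3)):
    img[y][x] = 1          # handwritten diagonal stroke

lines = hough_lines(img, min_votes=size)
print(lines)
```

Here theta = 0 corresponds to a vertical line (constant x) and theta = 90 to a horizontal one (constant y). Because a pen stroke is short and curved, its votes are spread over many cells and stay well below the count reached by a line that spans the whole table.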

Quote from: BernardLanguillier
It would be very easy to find an algo if there were no intersection between the grid and the characters.

Still, it shouldn't be too hard to find measurable characteristics for the pixels belonging to the grid vs those belonging to characters, knowing that:

- the grid is regularly spaced,
- it is made up of adjacent pixels that are close to being black,
- the grid is surrounded by a mostly white area,
- ...

Cheers,
Bernard

afaf9999

  • Newbie
  • Posts: 3
« Reply #4 on: September 09, 2009, 02:00:59 PM »

It is supposed to be the same color (black).

Quote from: tomrock
Make the boxes a color that the scanner won't see.

BernardLanguillier

  • Sr. Member
  • Posts: 9757
    • http://www.flickr.com/photos/bernardlanguillier/sets/
« Reply #5 on: September 10, 2009, 12:49:06 AM »

Quote from: afaf9999
Thanks for replying.
Actually, the idea of the project is to extract the characters from the grid even when they overlap or intersect it. The two objects have the same color, which is black, so my solution was to find the straight lines, because handwritten characters cannot be perfectly straight.
But my problem is how to apply this solution.

Got it, but it shouldn't be that hard.

Just scan from outside the frame, assume that the first pixel found belongs to the frame, then move on from there: identify straight segments (start point, length and direction), deduce the step of the grid statistically from them, and that should get you the pixels.
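
The segment-and-step idea could be sketched like this, in plain Python and only for the horizontal direction. It assumes a binary image stored as a list of rows (1 = ink) and one-pixel-thick lines; `horizontal_segments`, `grid_step` and `min_length` are hypothetical names, and the median is just one reasonable choice of statistic.

```python
from statistics import median


def horizontal_segments(img, min_length):
    """Return (row, start_col, length) for each horizontal ink run >= min_length."""
    segments = []
    for y, row in enumerate(img):
        start = None
        # The trailing 0 is a sentinel that closes a run reaching the row end.
        for x, value in enumerate(list(row) + [0]):
            if value and start is None:
                start = x
            elif not value and start is not None:
                if x - start >= min_length:
                    segments.append((y, start, x - start))
                start = None
    return segments


def grid_step(line_rows):
    """Median gap between consecutive line rows, tolerant of an odd outlier."""
    gaps = [b - a for a, b in zip(line_rows, line_rows[1:])]
    return median(gaps)


# Toy image: 13 rows x 10 columns, grid lines every 4 rows, one short pen stroke.
width, height = 10, 13
img = [[0] * width for _ in range(height)]
for y in (0, 4, 8, 12):
    img[y] = [1] * width
img[6][3] = img[6][4] = img[6][5] = 1   # handwriting, too short to be a line

segments = horizontal_segments(img, min_length=8)
line_rows = [y for y, start, length in segments]
step = grid_step(line_rows)
print(segments, step)
```

Running the same functions on the transposed image, `list(zip(*img))`, gives the vertical segments and the horizontal step of the grid.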

You'll have to deal with the width of these lines too...
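
Dealing with the line width, and deleting a line without cutting the characters that cross it, might be approached as in the sketch below. It is plain Python and rests on a simple assumption of its own: a column inside a line band is kept only if there is ink directly above and directly below the band, which is taken as a sign that a pen stroke passes through. `group_bands` and `erase_band` are hypothetical names.

```python
def group_bands(rows):
    """Group consecutive row indices into (first_row, thickness) bands."""
    bands = []
    for r in rows:
        if bands and r == bands[-1][0] + bands[-1][1]:
            bands[-1] = (bands[-1][0], bands[-1][1] + 1)   # extend current band
        else:
            bands.append((r, 1))                           # start a new band
    return bands


def erase_band(img, first, thickness):
    """Clear one horizontal band, keeping columns where a stroke crosses it."""
    out = [list(row) for row in img]
    above, below = first - 1, first + thickness
    for x in range(len(img[0])):
        crosses = (above >= 0 and below < len(img)
                   and img[above][x] and img[below][x])
        if not crosses:
            for y in range(first, first + thickness):
                out[y][x] = 0
    return out


# Toy image: a 2-pixel-thick grid line (rows 2 and 3) crossed by a vertical stroke.
img = [[0, 1, 0, 0, 0] for _ in range(6)]
img[2] = [1] * 5
img[3] = [1] * 5

bands = group_bands([2, 3])        # rows already identified as grid-line rows
cleaned = img
for first, thickness in bands:
    cleaned = erase_band(cleaned, first, thickness)
print(bands, cleaned)
```

This rule is crude: a stroke that merely touches the line from one side, or runs along it, is lost, so a real solution would likely need to look at stroke direction near the crossing as well.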

Cheers,
Bernard

papa v2.0

  • Full Member
  • Posts: 206
« Reply #6 on: September 15, 2009, 10:11:48 AM »

Hi,

Have you tried the 'edge' function in MATLAB?

g = edge(f, 'sobel', T, dir);

g is a logical edge map, T is the threshold, f is the input image, 'sobel' is the edge detector, and dir is the direction ('vertical', 'horizontal' or 'both', the default).

assuming your image is square in the first place
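
To show what the direction argument selects, here is a rough plain-Python sketch of directional Sobel filtering. It is not a reimplementation of MATLAB's edge function (no thinning, no normalised threshold); it only thresholds the raw Sobel responses on a grayscale image stored as a list of rows. `sobel_edges` and its parameters are hypothetical names for this illustration.

```python
def sobel_edges(gray, threshold, direction="both"):
    """Boolean edge map from raw Sobel responses (border pixels stay False).

    'horizontal' keeps horizontal edges (intensity changes from row to row),
    'vertical' keeps vertical edges, 'both' uses the gradient magnitude.
    """
    height, width = len(gray), len(gray[0])
    out = [[False] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            up, mid, down = gray[y - 1], gray[y], gray[y + 1]
            # Horizontal derivative: right column minus left column.
            gx = ((up[x + 1] + 2 * mid[x + 1] + down[x + 1])
                  - (up[x - 1] + 2 * mid[x - 1] + down[x - 1]))
            # Vertical derivative: row below minus row above.
            gy = ((down[x - 1] + 2 * down[x] + down[x + 1])
                  - (up[x - 1] + 2 * up[x] + up[x + 1]))
            if direction == "horizontal":
                strength = abs(gy)
            elif direction == "vertical":
                strength = abs(gx)
            else:
                strength = (gx * gx + gy * gy) ** 0.5
            out[y][x] = strength > threshold
    return out


# Toy 5x5 image: white on top (rows 0-1), black below (rows 2-4).
gray = [[255] * 5, [255] * 5, [0] * 5, [0] * 5, [0] * 5]
h_edges = sobel_edges(gray, threshold=500, direction="horizontal")
v_edges = sobel_edges(gray, threshold=500, direction="vertical")
print(h_edges[1][2], h_edges[2][2], h_edges[3][2])
```

Note that an edge detector marks both sides of a grid line rather than the line itself, so it finds where lines are but does not by itself say which pixels to delete.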

Jonathan Wienke

  • Sr. Member
  • Posts: 5829
    • http://visual-vacations.com/
« Reply #7 on: September 15, 2009, 07:25:37 PM »

What OP is probably really interested in is figuring out a better way to remove obscurations from CAPTCHA images so that spambots can more easily register themselves on forums...

dkekesi

  • Newbie
  • Posts: 5
    • http://www.kekesi.com
« Reply #8 on: September 27, 2009, 06:23:23 PM »

While this doesn't strictly fit the profile of this site, document imaging applications can solve your problem easily, and do much more. I work with Kofax Capture (www.kofax.com) as a solution provider (in a different part of the world, so I may not be of further use). It can do what you seek and even recognize handprinted characters. This is a standard task for that software. It is not cheap, but it does the job fine. Kofax has resellers all over the world who will be more than glad to help you.
Best Regards,
Dániel Kékesi
www.kekesi.com

Brad Proctor

  • Full Member
  • Posts: 150
« Reply #9 on: September 27, 2009, 07:46:54 PM »

Quote from: Jonathan Wienke
What OP is probably really interested in is figuring out a better way to remove obscurations from CAPTCHA images so that spambots can more easily register themselves on forums...

lol, I'd guess you're right.
Brad Proctor