Amiga.org
Operating System Specific Discussions => Amiga OS => Amiga OS -- Development => Topic started by: Karlos on January 20, 2005, 01:56:58 AM
-
Hi,
I'm looking for an algorithm that can look at a bitmap image and find bounding rectangles for any closed sets (e.g. a solid blob of arbitrary shape) of pixels that match a particular criterion, such as colour. I don't need to find the exact shape, just the bounding rectangle.
For example, suppose we had an image containing three non-overlapping ovals of a particular colour we are interested in. The algorithm needs to return the three rectangles that contain the oval shapes.
I do have an algorithm already that needs to make a pair of line by line scans across the bitmap (first vertically, then horizontally, finally correlating the two scans) but it is a bit slow, even with the various early-out optimisations you can make to the line-scanning.
Any takers?
-
Why not take a 2D FFT and then correlate that against a series of FFTs of every conceivable shape and then ...
Hmmmm.
Your way is probably faster and more possible in a finite amount of time.
-
Thinks a bit more...
Are these areas allowed to be "hollow", i.e. can they enclose a set of out-of-range pixels?
And how do you start looking? Do you start in the shape, as if you were doing a paint fill, or are you just looking for pixels that satisfy the criteria within the entire pixmap?
(So, could there be more than one match.)
And, if you scan one way, haven't you already scanned the other? Really?
Like if you walk the Ys and you know the x where you hit the good stuff and the x where you left.
And you could get faster if you took those start and ends and went straight down from there to the next Y
and if you are on the right side (for e.g.) and you go down and your pixel is in range, then go right further to find the edge, else go left.
For the other side, vice versa. If your left and right get to be the same, you've closed out the bottom.
Then from your starting Y go back up the same way to close your top.
-
re-reading, I see you've answered one of my questions.
Anyway, I don't think what I said would catch concave shapes, because you'd have to go back on yourself when they fold, but it's pretty close to a generic fill algo and there's a bunch of them around.
The amount of time it takes to finish depends upon the shape, whereas your method should take the same amount of time regardless of shape.
-
By the way, I should thank you for helping me boost my post count which has been languishing of late.
-
Karlos wrote:
Hi,
I'm looking for an algorithm that can look at a bitmap image and find bounding rectangles for any closed sets (e.g. a solid blob of arbitrary shape) of pixels that match a particular criterion, such as colour. I don't need to find the exact shape, just the bounding rectangle.
For example, suppose we had an image containing three non-overlapping ovals of a particular colour we are interested in. The algorithm needs to return the three rectangles that contain the oval shapes.
I do have an algorithm already that needs to make a pair of line by line scans across the bitmap (first vertically, then horizontally, finally correlating the two scans) but it is a bit slow, even with the various early-out optimisations you can make to the line-scanning.
Any takers?
I had to make something similar when I was trying to make GUI in SDL.
I had to look for all the areas where the "windows" overlapped and then make sure those areas were not updated.
I had to do it vertically and then horizontally (for every window, starting at the one below the topmost one) too. I thought it would be slow, but it's not... on any of my PCs... I wouldn't want to run it on my 68Ks or even my PPC :-/
-
Hi,
First of all, I should point out that this would be a pre-processing step, so it isn't critical speed wise. I just don't like slowness in general.
What is important is that the rectangles generated are as tight fitting as possible, as their content is used for a realtime application.
FluffyMcDeath wrote:
Thinks a bit more...
Are these areas allowed to be "hollow", i.e. can they enclose a set of out-of-range pixels?
It's not that complex. Consider the letter C - the algorithm would return the smallest rectangle that the complete letter fits into. A letter O of the same basic size would return the same rectangle.
The shapes are allowed to be hollow, just like the letter O in the above example. As long as the outer boundary is enclosed in the box, that is all that matters.
And how do you start looking? Do you start in the shape, as if you were doing a paint fill, or are you just looking for pixels that satisfy the criteria within the entire pixmap?
(So, could there be more than one match.)
What the existing algorithm does is to first scan down vertically a line at a time. It starts out with the assumption that we haven't hit an area.
The moment we find a pixel along the current scanline that matches our criteria, the y coord is noted as a 'starting Y ordinate' (yMin), and the method of scanning now changes to look for the next scanline that doesn't contain a pixel matching our requirements. When one of those is found, the y value is noted as a 'terminating Y ordinate' (yMax).
If we get to the end of the bitmap while still inside an area of "the good stuff", the scan is always terminated with a yMax there.
And so the scan proceeds. At the end, we have a set of yMin/yMax pairs.
We then do a horizontal scan in the same way to locate the corresponding xMin / xMax pairs.
We can correlate these to get a list of rectangles. It isn't perfect: suppose we had two circles, one large and one small, with the same y centre but at non-overlapping values of x.
The bounding rectangle for the larger one will be perfect, but the bounding rectangle for the smaller one will be as tall as the larger one. However, this is really a trifling inconvenience for what I have in mind.
And, if you scan one way, haven't you already scanned the other? Really?
Like if you walk the Ys and you know the x where you hit the good stuff and the x where you left.
This is what I was thinking, in a way.
And you could get faster if you took those start and ends and went straight down from there to the next Y
and if you are on the right side (for e.g.) and you go down and your pixel is in range, then go right further to find the edge, else go left.
For the other side, vice versa. If your left and right get to be the same, you've closed out the bottom.
Then from your starting Y go back up the same way to close your top.
Hmm. So you are saying we could basically floodfill the areas when we find them and keep track of the xMin/xMax and yMin/yMax during the fill. That might be a good solution.
-
FluffyMcDeath wrote:
By the way, I should thank you for helping me boost my post count which has been languishing of late.
My pleasure :-D
-
bloodline wrote:
I had to make something similar when I was trying to make GUI in SDL.
I had to look for all the areas where the "windows" overlapped and then make sure those areas were not updated.
I had to do it vertically and then horizontally (for every window, starting at the one below the topmost one) too. I thought it would be slow, but it's not... on any of my PCs... I wouldn't want to run it on my 68Ks or even my PPC :-/
You did this for rectangular windows? How come you never used a straightforward layered rectangle clipping approach for that? It would be orders of magnitude more efficient than scanning pixels.
My problem is that I'm dealing with irregular blobs so I have to do this.
-
Karlos wrote:
bloodline wrote:
I had to make something similar when I was trying to make GUI in SDL.
I had to look for all the areas where the "windows" overlapped and then make sure those areas were not updated.
I had to do it vertically and then horizontally (for every window, starting at the one below the topmost one) too. I thought it would be slow, but it's not... on any of my PCs... I wouldn't want to run it on my 68Ks or even my PPC :-/
You did this for rectangular windows? How come you never used a straightforward layered rectangle clipping approach for that? It would be orders of magnitude more efficient than scanning pixels.
No, I didn't scan pixels :-), but I had two loops: one that divided the windows horizontally, and one that did it vertically, which at the end cuts the windows up into visible and invisible rectangles.
I was just saying that separating out the H and the V is not a bad thing in my mind.
My problem is that I'm dealing with irregular blobs so I have to do this.
What about colour? do you always know what colour the background is going to be?
I'm wondering if you can't use some collision detection thingie... :-/
-
The background colour is not really known.
All you really know is that within the image, there will be irregular splodges that have a value within your target tolerance. All around them are pixels simply outside your tolerance. There are no areas of flat colour for the background.
-
If your shape is in ONE piece then you can,
for (say) a screen of 640x480,
scan vertical lines separated by 80 pixels
(y = 0, 80, 160, 240, 320, 400).
If nothing is hit,
continue with
(y = 1, 81, 161, 241, 321, 401)
etc...
When a line hits, say at 241, then finish off the remaining lines 242 -> 319.
The same in X.
:-)
Alain
-
@ Karlos
I don't see why you need to scan vertically; you should be able to get both x and y values by scanning horizontally.
Also, by scanning vertically you'll take a potentially massive speed hit, as reading from RAM non-sequentially is very slow.
Assuming the index is in the top left:
Scan across, and when you hit the first matching pixel record xMin, xMax, yMin and yMax.
Scan the next line; if a pixel's x value is lower and it's touching the current xMin, record it as the new xMin, and increment yMax; while there are other relevant pixels touching, set a higher xMax.
--
That should work for solid areas but you'll need to implement something to track "previous pixels which were touching" if you want to capture non-solid objects.
That useful?
EDIT: I was suggesting an area fill for an alternative but this will not work on irregular shapes (i.e. an S).
-
@Minator
When I said "scanning vertically" I meant scanning lines in the Y direction, rather than actually reading down a 1-pixel-wide column. I guess that isn't very obvious :lol:
To be clear, I read successive scanlines (each one at an increasing y) first, to determine the yMin / yMax boundaries. This amounts to reading along a horizontal line until we first hit something. This is our first yMin.
The thing is, due to the irregularity of the shapes, you can't say "this is also your x-min" because it simply isn't possible to know - unless it occurs at x=0, that is. The next scanline might have a pixel of the same criteria some pixels to the left of the one on the previous line. You would need a quite sophisticated mechanism for tracking all the x values you uncovered in the previous scanline and working out which ones correspond to the equivalent x values on the current scanline.
By using 2 scans, we can concentrate on resolving only one set of extrema per pass. The idea works; I'm just trying to find a more optimal approach. As you say, it's scanning the successive vertical lines (each one at an increasing value of x) that takes the hit.
Of course, a single scan of the entire thing does uncover all the information but it is not easy to resolve it without using a memory of some kind. The two scan approach basically forgoes this.
-
@Karlos
Just another silly idea...
Could there be more than one object you're trying to identify in the same horizontal space? If not, you could use some form of prediction about where the start of the object is, based on the start of the previous line (after you detect the start of a horizontal line, that is...).
-
Hi Jose,
Yes. It's basically the case that there will be several such shapes in any given 2D area. The idea is to find the smallest bounding boxes around them, within a tolerance threshold.
The problem is still open ended, but I'm also looking into another solution which may be more useful, especially for awkward shapes.
-
Here's an image to make the problem more visual:
(http://secure.tillpoint.com/_rsc/blobs.gif)
The blue areas are outside our tolerance, the grey areas inside. The white rectangles are the areas I wish to find. Note from the diagram that:
A bounding rectangle may contain more than one 'blob', provided it properly fits the largest one.
A blob may be 'hollow' - I am only interested in the box around the outer extent.
-
Hey. Since the blobs can repeat themselves in a line, I think all pixels have to be scanned. So the thing would be to optimize the detection routine. I know you more than know this, it's just that the program I'm doing has a similar routine ;) Except that instead of trying to identify pixels that fall within a certain range, it compares pixels of two successive images. I'll let y'all know what it does when I get a good, completed working version. I had a doubt that I think applies here too:
When comparing two pixels and checking the difference, is there a way to use LONGs instead of BYTEs to access, compare and infer the difference of each colour component between 2 pixels? That would probably accelerate it by a factor of 3 or so...
I want to use C for now to make it portable to AOS4, so no 68k assembler (just for now :-D ).
Sorry for hijacking the thread :-D but I think this could be useful for the code that makes the pixel comparisons used in your detection routine too; it's practically the same, I think, except that you're using a range (two pixel comparisons, probably, for the higher and lower values?).
[EDIT] I meant, in your case it could be two pixel comparisons (higher and lower range value checking) for each colour component.
-
@Jose
Are your pixels arranged as 8-bit bytes? If your data reside in copyback memory, you will find that bytewise comparison is not such a big issue on an 040+. It is still slower, but not as bad as you might expect.
However, you can compare 2 sets of 4x8-bit pixels or 2 sets of 2x15/16-bit pixels using longwords.
If you exclusive-OR (XOR) the two longwords together, you will highlight any bits that differ between the two sets of pixels. All the bits which are the same in both sets will be zero. Depending on what you intend to do with the data, I'm not sure if this will help you or not.
Oh - if you need to know which pixels are different, there are a couple of ways of working it out. Probably the easiest is to take the result of the 32-bit XOR operation and mask it with four successive 32-bit masks that each isolate a byte in the longword. Anything coming up as non-zero is a difference.
-
>I don't see why you need to scan vertically
All right, I said vertically, but you can scan horizontally first. My idea was not to scan each line after
another, but to skip N lines, so IF the shape is at the bottom (or right) of the screen you hit it after scanning only a few lines rather than all the lines (or columns).
>Also by scanning vertically you are reading from RAM non-sequentially, which is very slow.
Absolutely right :-)
BTW, with AltiVec you can do massive compares of data.
Some instructions on UBYTEs:
vavgub: average of 16 UBYTEs
vcmpgtsb: compare greater-than of 16 BYTEs
vmaxub: maximum of 16 UBYTEs
vminub: minimum of 16 UBYTEs
etc...
Alain :-D
-
@woof
That's interesting to know - pity I don't have AltiVec :-(
Well, not yet ;-)
As it goes, this doesn't help my immediate problem - which is better explained by the image than anything I can easily write :-)
Mind you, of course, all I am looking for is to speed up something that isn't really time-critical (or at least not yet anyway - that might change).
-
I can't say that I understood all of this, but that won't stop me from
adding my "mustard" ;)
I would create two arrays of struct blob (minx, maxx, miny, maxy), both
big enough to hold the maximum number of blobs I expect per line
(plus a safety margin, of course; let's say struct blob even[20] and odd[20]).
I would now start with line 0, and each time I get a hit I fill in
the mins into one even blob. I'll then search for the end of the blob in
this line and add the maxx. Maxy is set to the same as miny.
Then I do the same for line 1, filling in odd blobs.
Now I check whether blobs in even are connected to blobs in odd, and
change the odd ones to include the even one.
Even blobs that have no partner are fully found and pushed into the
output buffer.
And now I fill the even blobs with line 2, and so on.
This makes sure every pixel is only looked at once, and I don't think
the blob comparing adds much overhead.
-
Here's something cool I found, though it's designed for "shrinkwrap" boundings and not boxes:
http://cgm.cs.mcgill.ca/~orm/rotcal.html
What language are you using? I'm thinking about writing some code to allow for a flood filler that won't work if an object is open (spilling out to the edges of the screen). It's for a paint program that lets multiple people use one canvas.
-
@ Karlos
Out of interest, what is this for?
-
@Waccoon
I'm using C. Well, I'm really using C++, but the code might be used in a related C project so it might as well be just C.
@Minator
I could tell you, but then I'd have to kill you :lol:
Nah, it isn't for anything important, it was just a small problem that cropped up in a recent project that I suddenly found interesting in its own right.
Basically I am determining some bounding-box extents based on the opacity of an image for some processing I need to do. As it goes, contrary to what you might expect, I need the actual bounding box and not the pixel-exact 'mask' of the area.
That said, even if one does find the pixel exact mask, one can easily determine the bounding box, but I thought there might be a quicker way that takes advantage of not needing the exact mask...
-
Ahhh... Karlos... just get a 4Ghz (or preferably an Athlon64 4000+) machine and brute force it, by scanning every pixel :-D
-
bleh, you lazy, unoptimizing git :lol:
-
Karlos wrote:
bleh, you lazy, unoptimizing git :lol:
:roflmao: Well, what you are suggesting is like trying to dig a tunnel with a spoon... use a spade, man!!! ;-)
-
You know, it's an interesting point.
I know a few people who write the most atrocious code imaginable, simply because they are spoiled by the performance of their hardware. People who should not be let anywhere near a compiler, for that matter. Their only skill is in the application of brute force to solve any problem. Then they use some convenience-oriented, high-level language/design tools with which to write their ill-thought-out code.
I worry that the arts of problem analysis, algorithm design and implementation efficiency are fast dying.
-
You...you.....geek! :-P
-
Karlos wrote:
You know, it's an interesting in point.
I know a few people who write the most atrocious code imaginable, simply because they are spoiled by the performance of their hardware. People who should not be let anywhere near a compiler, for that matter. Their only skill is in the application of brute force to solve any problem. Then they use some convenience-oriented, high-level language/design tools with which to write their ill-thought-out code.
I would fit probably into that category. I loved the Amiga because I could program stuff in any sloppy way and get great results, much better than my Speccie/Amstrad/C64 friends :-D
I worry that the arts of problem analysis, algorithm design and implementation efficiency are fast dying.
Well, I would suggest the shift is from a problem oriented view of program design to a solution one.
Err... We don't need to worry about the implementation of the problem, and instead worry more about the outcome of the solution... or something...
-
bloodline wrote:
Well, I would suggest the shift is from a problem oriented view of program design to a solution one.
Err... We don't need to worry about the implementation of the problem, and instead worry more about the outcome of the solution... or something...
Which is a completely bollox approach for this situation. And indeed most of the things I work on.
I guess I like the mental challenges of actually doing something well.
-
Karlos wrote:
bloodline wrote:
Well, I would suggest the shift is from a problem oriented view of program design to a solution one.
Err... We don't need to worry about the implementation of the problem, and instead worry more about the outcome of the solution... or something...
Which is a completely bollox approach for this situation. And indeed most of the things I work on.
I guess I like the mental challenges of actually doing something well.
Yeah, well I suppose when it's a computer problem... hmmm. Anyway, there is a reason why DSPs have been developed... and that's for this type of task... so get yourself a DSP or more horsepower :-D
-
You might try looking up 'simple border tracing' in your favourite search engine.
Or try: http://ct.radiology.uiowa.edu/~jiangm/courses/dip/html/node126.html
To get a box around it, you just have to take the max x,y and min x,y of the list of coords. :)
-
@ 23JUL/JULES :-)
Thanks for the link. The mechanism there reminds me of a trick used to speed up a mandelbrot generator I tried once. It had a mask to define already calculated pixels and used this border tracing to determine the region that had reached the iteration limit, bound it off and 'flood fill' the area as being already rendered.
-
@Karlos
"..if you need to know which pixels are different, there are a couple of ways of working it out. Probably the easiest is to take the result of the 32-bit xor operation and mask it with four succesive 32-bit masks that each isolate a byte in the long word. Anything coming up as non zero is a difference"
Slightly more complicated. I want to determine the colour difference of each colour component between two equivalent pixels (i.e. in the same position) in two images of the same size. I can't use LONGs for that, because the result of a difference might overflow the boundaries of each BYTE in a LONG.
I'm also scanning all pixels :-D so this is the critical part of the code..
-
@Jose
Stay away from the 24-bit packed pixel format (3 bytes per pixel). It is surely not God's own pixel representation ;-)
If you need to compare true-colour data, have your code load it as 32-bit by expanding with an alpha channel. Even if you don't use the alpha for anything, the benefits of sensible alignment and longword accesses to the data more than make up for the 25% increase in data size.
I don't know how exactly you are handling your images, but it is relatively straightforward to have datatypes load them and then use the DTM_READPIXELARRAY method to extract a 32-bit truecolour (ARGB) representation, even if the source data is not 32-bit.
-
@Karlos
Yes, but if I use DTM_READPIXELARRAY to get an ARGB representation and the source file/memory region is in a different format, say 24-bit packed (isn't that, along with YUV, what most JPEGs are in, by the way?), won't things slow down just the same, because the OS will have to do the conversion anyway? I thought about doing a routine just for the 24-bit packed format which uses LONGs, treating each LONG as, successively, RGBR, then GBRG and BRGB... but I haven't looked at the details to see if it's possible in my case...
-
Karlos wrote:
@ 23JUL/JULES :-)
Thanks for the link. The mechanism there reminds me of a trick used to speed up a mandelbrot generator I tried once. It had a mask to define already calculated pixels and used this border tracing to determine the region that had reached the iteration limit, bound it off and 'flood fill' the area as being already rendered.
Ah damn. Beaten to it. I came up with the same algorithm after staring at the blue/gray picture for a while. I have the nagging feeling you could do something clever with edge detection algorithms followed by a fast byte/word scanner to get a rough idea where the outlines of the blobs are. Then you can let loose the above algorithm. Finally wrap up with a box-within-box algorithm and merge them as necessary. But you're looking at an O(N^high integer) algorithm at the very least.
By the way, if you can locate Nico François' TurboMandel (or MandelVroom, forgot the exact name) you can witness a fractal flood fill in action. It was quite a sight to see fractals rendered in seconds rather than hours on a lowly 68000 :-).