|
|
|
|
|
| Edge image thresholding | |
| Edge relaxation | |
| Border tracing | |
| Edge following as graph searching | |
| Edge following as dynamic programming | |
| Hough transforms | |
| Border detection using border location information | |
| Region construction from borders |
![]()
| Edge-based segmentation represents a large group of methods based on
information about edges in the image | |
| Edge-based segmentations rely on edges found in an image by edge detecting
operators -- these edges mark image locations of discontinuities in gray
level, color, texture, etc. | |
| Image resulting from edge detection cannot be used as a segmentation
result. Supplementary processing steps must follow to combine edges
into edge chains that correspond better with borders in the image. | |
| The final aim is to reach at least a partial segmentation -- that is, to
group local edges into an image where only edge chains with a correspondence
to existing objects or image parts are present. | |
| The more prior information that is available to the segmentation process,
the better the segmentation results that can be obtained. | |
| The most common problems of edge-based segmentation are an edge presence in locations where there is no border, and no edge presence where a real border exists. (false alarms and missed detections) |
| Almost no zero-value pixels are present in an edge image, but small edge values correspond to non-significant gray level changes resulting from quantization noise, small lighting irregularities, etc. | |
| Selection of an appropriate global threshold is often difficult and
sometimes impossible; p-tile thresholding can be applied to define a
threshold | |
| Alternatively, non-maximal suppression and hysteresis thresholding can be used as was introduced in the Canny edge detector. |
| Borders resulting from the previous method are strongly affected by image
noise, often with important parts missing. | |||||||||||||
| Considering edge properties in the context of their mutual neighbors can
increase the quality of the resulting image. | |||||||||||||
| All the image properties, including those of further edge existence, are
iteratively evaluated with more precision until the edge context is totally
clear - based on the strength of edges in a specified local neighborhood,
the confidence of each edge is either increased or decreased. | |||||||||||||
| A weak edge positioned between two strong edges provides an example of
context; it is highly probable that this inter-positioned weak edge should
be a part of a resulting boundary. If, on the other hand, an edge
(even a strong one) is positioned by itself with no supporting context, it
is probably not a part of any border. | |||||||||||||
Edge context is considered at both ends of an edge, giving the minimal
edge neighborhood. ![]() | |||||||||||||
| The central edge e has a vertex at each of its ends and three
possible border continuations can be found from both of these vertices. | |||||||||||||
Vertex type -- number of edges emanating from the vertex, not
counting the edge e. The type of edge e can then be
represented using a number pair i-j describing edge patterns at each
vertex, where i and j are the vertex types of the edge e.
The following situations are possible:
| |||||||||||||
| Edge relaxation is an iterative method, with edge confidences converging
either to zero (edge termination) or one (the edge forms a border). | |||||||||||||
The confidence of each edge e in the first iteration can be defined as a
normalized magnitude of the crack edge, with normalization based on
either the global maximum of crack edges in the whole image, or on a local
maximum in some large neighborhood of the edge ![]() | |||||||||||||
The main steps of the above Algorithm are evaluation of vertex types
followed by evaluation of edge types, and the manner in which the edge
confidences are modified. ![]() | |||||||||||||
A vertex is considered to be of type i if ![]() where a,b,c are the normalized values of the other incident crack edges m=max(a,b,c,q) ... the introduction of the quantity q ensures that type(0) is non-zero for small values of a. | |||||||||||||
| For example, choosing q=0.1, a vertex (a,b,c)=(0.5, 0.05, 0.05) is a type
1 vertex, while a vertex (0.3, 0.2, 0.2) is a type 3 vertex. | |||||||||||||
| Similar results can be obtained by simply counting the number of edges
emanating from the vertex above a threshold value. | |||||||||||||
| Edge type is found as a simple concatenation of vertex types, and edge
confidences are modified as follows:
| |||||||||||||
| Edge relaxation, as described above, rapidly improves the initial edge
labeling in the first few iterations. | |||||||||||||
| Unfortunately, it often slowly drifts giving worse results than expected
after larger numbers of iterations. The reason for this strange behavior is in searching for the global maximum of the edge consistency criterion over all the image, which may not give locally optimal results. A solution is found in setting edge confidences to zero under a certain threshold, and to one over another threshold which increases the influence of original image data. | |||||||||||||
Therefore, one additional step must be added to the edge confidence
computation ![]() where T_1 and T_2 are parameters controlling the edge relaxation convergence speed and resulting border accuracy. | |||||||||||||
| This method makes multiple labeling possible; the existence of two edges at different directions in one pixel may occur in corners, crosses, etc. |
| If a region border is not known but regions have been defined in the
image, borders can be uniquely detected. | |||||||||||||||||||||||
First, let us assume that the image with regions is either binary or that
regions have been labeled.
| |||||||||||||||||||||||
Boundary properties better than those of outer borders may be found in extended
borders.
| |||||||||||||||||||||||
The extended boundary is defined using 8-neighborhoods e.g.P_4(P)
denotes the pixel immediately to the left of pixel P. Four kinds of
inner boundary pixels of a region R are defined; if Q denotes
pixels outside the region R, then a pixel P is an element
of R:
| |||||||||||||||||||||||
Let LEFT(R), RIGHT(R), UPPER(R), LOWER(R) represent the corresponding
subsets of R. The extended boundary EB is defined as a set of points
P,P_0,P_6,P_7 satisfying the following conditions: ![]() | |||||||||||||||||||||||
The extended boundary can easily be constructed from the outer boundary. Using
an intuitive definition of RIGHT, LEFT, UPPER, and LOWER outer boundary
points, the extended boundary may be obtained by shifting all the UPPER
outer boundary points one pixel down and right, shifting all the LEFT outer
boundary points one pixel to the right, and shifting all the RIGHT outer
boundary points one pixel down. The LOWER outer boundary point positions
remain unchanged. ![]() | |||||||||||||||||||||||
A more sophisticated approach is based on detecting common boundary
segments between adjacent regions and vertex points in boundary segment
connections. The detection process is based on a look-up table, which
defines all 12 possible situations of the local configuration of 2 x2 pixel
windows, depending on the previous detected direction of boundary, and on
the status of window pixels which can be inside or outside a region.
| |||||||||||||||||||||||
| A more difficult situation is encountered if the borders are traced in grey level images where regions have not yet been defined. The border is represented by a simple path of high-gradient pixels in the image. Border tracing should be started in a pixel with a high probability of being a border element, and then border construction is based on the idea of adding the next elements which are in the most probable direction. To find the following border elements, edge gradient magnitudes and directions are usually computed in pixels of probable border continuation. |
![]()
| Whenever additional knowledge is available for boundary detection, it
should be used - e.g., known approximate starting point and ending point of
the border. Even some relatively weak additional requirements such as
smoothness, low curvature, etc. may be included as prior knowledge. | |||||||||
| A graph is a general structure consisting of a set of nodes n_i and
arcs between the nodes [n_i,n_j]. We consider oriented and
numerically weighted arcs, these weights being called costs. | |||||||||
| The border detection process is transformed into a search for the optimal
path in the weighted graph. Assume that both edge magnitude and edge
direction information is available in an edge image. | |||||||||
Figure 5.21a shows an image of edge directions, with only significant
edges according to their magnitudes listed. Figure 5.21b shows an
oriented graph constructed in accordance with the presented principles. ![]() | |||||||||
| To use graph search for region border detection, a method of oriented
weighted-graph expansion must first be defined. | |||||||||
Graph Search Example
| |||||||||
| If no additional requirements are set on the graph construction and
search, this process can easily result in an infinite loop. | |||||||||
| To prevent this behavior, no node expansion is allowed that puts a node on
the OPEN list if this node has already been visited, and put on the OPEN
list in the past. A simple solution to the loop problem is not to
allow searching in a backward direction. It may be possible to
straighten the processed image (and the corresponding graph). | |||||||||
| The estimate of the cost of the path from the current node n_i to the end
node n_B has a substantial influence on the search behavior. | |||||||||
| If this estimate ~h(n_i) of the true cost h(n_i) is not considered, so ~h(n_i)=0,
no heuristic is included in the algorithm and a breadth-first search is
done. The detected path will always be optimal according to the
criterion used, the minimum cost path will always be found. Applying
heuristics, the detected cost does not always have to be optimal but the
search can often be much faster. | |||||||||
| The minimum cost path result can be guaranteed if ~h(n_i)<= h(n_i) and
if the true cost of any part of the path c(n_p,n_q) is larger than the
estimated cost of this part ~c(n_p,n_q). The closer the estimate ~h(n_i)
is to h(n_i), the lower the number of nodes expanded in the search. The
problem is that the exact cost of the path from the node n_i to the end node
n_B is not known beforehand. | |||||||||
| In some applications, it may be more important to get a quick rather than
an optimal solution. Choosing ~h(n_i)>h(n_i), optimality is not
guaranteed but the number of expanded nodes will typically be smaller
because the search can be stopped before the optimum is found. | |||||||||
| If ~h(n_i)=0, the algorithm produces a minimum-cost search. If ~h(n_i)>h(n_i), the algorithm may run faster, but the minimum-cost result is not guaranteed. If ~h(n_i)<= h(n_i), the search will produce the minimum-cost path if and only if the conditions presented earlier are satisfied. If ~h(n_i)= h(n_i), the search will always produce the minimum-cost path with a minimum number of expanded nodes. | |||||||||
| The better the estimate of h(n), the smaller the number of nodes that must be expanded. | |||||||||
| A crucial question is how to choose the evaluation cost functions for
graph-search border detection. | |||||||||
Some generally applicable cost functions are
| |||||||||
| Since the range of border detection applications is quite wide, the cost
functions described may need some modification to be relevant to a
particular task. | |||||||||
| Graph searching techniques offer a convenient way to ensure global
optimality of the detected contour. The detection of closed structure
contours would involve geometrically transforming the image using a polar to
rectangular co-ordinate transformation in order to `straighten' the contour. | |||||||||
| Searching for all the borders in the image without knowledge of the start and end-points is more complex. Edge chains may be constructed by applying a bidirectional heuristic search in which half of each 8-neighborhood expanded node is considered as lying in front of the edge, the second half as lying behind the edge. |
![]()
| Dynamic programming is an optimization method based on the principle of
optimality. It searches for optima of functions in which not all
variables are simultaneously interrelated. | |||||||||||||||
| The main idea of the principle of optimality is: Whatever the path to the node E was, there exists an optimal path between E and the end-point. In other words, if the optimal path start-point -- end-point goes through E then both its parts start-point -- E and E -- end-point are also optimal. If the graph has more layers, the process is repeated until one of the end-points is reached. ![]() The complete graph must be constructed to apply dynamic programming, and this may follow general rules given in the previous section. The objective function must be separable and monotonic (as for the A-algorithm); evaluation functions presented in the previous section may also be appropriate for dynamic programming. | |||||||||||||||
Comparisons ... Heuristic Graph Search vs. Dynamic Programming
|
![]()
If an image consists of objects with known shape and size, segmentation
can be viewed as a problem of finding this object within an image.
Such as a line which can be defined by its slope and y-intercept or a circle
defined by its radius and position![]() | |
| The original Hough transform was designed to detect straight lines and
curves. A big advantage of this approach is robustness of segmentation
results; segmentation is not too sensitive to imperfect data or noise. | |
This means that any straight line in the image is represented by a single
point in the k,q parameter space and any part of this straight line is
transformed into the same point. The main idea of line detection is to
determine all the possible line pixels in the image, to transform all lines
that can go through these pixels into corresponding points in the parameter
space, and to detect the points (a,b) in the parameter space that frequently
resulted from the Hough transform of lines y=ax+b in the image. ![]() | |
Detection of all possible line pixels in the image may be achieved by
applying an edge detector to the image. Then, all pixels with edge magnitude
exceeding some threshold can be considered possible line pixels. ![]() | |
| In the most general case, nothing is known about lines in the image, and therefore lines of any direction may go through any of the edge pixels. In reality, the number of these lines is infinite, however, for practical purposes, only a limited number of line directions may be considered. The possible directions of lines define a discretization of the parameter k. Similarly, the parameter q is sampled into a limited number of values. The parameter space is not continuous any more, but rather is represented by a rectangular structure of cells. This array of cells is called the accumulator array A, whose elements are accumulator cells A(k,q). | |
| For each edge pixel, parameters k,q are determined which represent lines
of allowed directions going through this pixel. For each such line, the
values of line parameters k,q are used to increase the value of the
accumulator cell A(k,q). Clearly, if a line represented by an equation
y=ax+b is present in the image, the value of the accumulator cell A(a,b)
will be increased many times -- as many times as the line y=ax+b is detected
as a line possibly going through any of the edge pixels. | |
| Lines existing in the image may be detected as high-valued accumulator
cells in the accumulator array, and the parameters of the detected line are
specified by the accumulator array co-ordinates. As a result, line
detection in the image is transformed to detection of local maxima in the
accumulator space. |
| The parametric equation of the line y=kx+q is appropriate only for
explanation of the Hough transform principles -- it causes difficulties in
vertical line detection (k -> infinity) and in nonlinear discretization
of the parameter k. | |
| If a line is represented as the Hough transform does not suffer from these limitations. ![]() | |
| Discretization of the parameter space is an important part of this approach. Also, detecting the local maxima in the accumulator array is a non-trivial problem. In reality, the resulting discrete parameter space usually has more than one local maximum per line existing in the image, and smoothing the discrete parameter space may be a solution. | |
Algorithm 5.15 Curve detection using the Hough Transform
| |
| Generalization to more complex curves that can be described by an analytic
equation is straightforward. Consider an arbitrary curve represented
by an equation f(x,a)=0, where {a} is the vector of
curve parameters. | |
| If we are looking for circles, the analytic expression f(x,a) of
the circle with center (a,b) and radius r is the accumulator data structure must be three-dimensional. | |
| Even though the Hough transform is a very powerful technique for curve
detection, exponential growth of the accumulator data structure with the
increase of the number of curve parameters restricts its practical usability
to curves with few parameters. | |
| If prior information about edge directions is used, computational demands can be decreased significantly. Consider the case of searching the circular boundary of a dark region, letting the circle have a constant radius r=R for simplicity. Without using edge direction information, all accumulator cells A(a,b) are incremented in the parameter space if the corresponding point (a,b) is on a circle with center x. | |
With knowledge of direction, only a small number of the accumulator cells
need be incremented. For example, if edge directions are quantized into 8
possible values, only one eighth of the circle need take part in
incrementing of accumulator cells. Using edge directions, candidates
for parameters a and b can be identified from the following formulae:
where phi(x) refers to the edge direction in pixel x and Delta phi is the maximum anticipated edge direction error. Another heuristic that has a beneficial influence on the curve search is to weight the contributions to accumulator cells A(a) by the edge magnitude in pixel x
| |
| The Generalized Hough Transform constructs a parametric curve based
on situations seen in the learning stage.
|
![]()