Image processing (processing.m)
First, I cropped the image to reduce the sky area, which is useless for feature detection.
Before adjusting the image contrast, I tried to remove as much sky as possible, since it could reduce the quality of the contrast improvement.
I converted the image to the HSV color space and selected only a small range of the value channel V;
to improve the quality of the mask I applied one iteration of imdilate to remove the small blobs. The result is a sky mask.
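The masking step described above could look roughly like the following sketch. The threshold on V, the structuring element and the synthetic test image are assumptions made for illustration; they are not values taken from processing.m.

```matlab
% Synthetic RGB image (assumption): bright "sky" on top, darker "castle" below.
img = 0.3 * ones(40, 40, 3);
img(1:20, :, :) = 0.95;
img(10, 10, :)  = 0.3;              % small dark blob inside the sky

hsvImg = rgb2hsv(img);              % convert to the HSV color space
V = hsvImg(:, :, 3);                % value channel

vMin = 0.9;                         % hypothetical lower bound of the selected V range
skyMask = V >= vMin;                % raw sky mask, with a hole at (10, 10)

% One dilation pass fills the small holes left inside the sky region.
skyMask = imdilate(skyMask, strel('disk', 2));
```

Dilation also grows the mask slightly into the castle, so the radius of the structuring element is a trade-off between filling holes and losing pixels near the skyline.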
Before applying the canny function we need a single-channel image. I tried several methods to convert the 3-channel image to a single channel, some of them using one channel of a specific color space such as HSV, but in the end I used the classical grayscale conversion because the other methods gave inadequate results.
Then I enhanced the image contrast by applying adaptive histogram equalization with adapthisteq. Because the sky is strongly overexposed, I had to limit the equalization to the castle only (excluding the sky); to do this I used the previously computed sky mask together with the roifilt2 function.
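A minimal sketch of the masked equalization follows. The synthetic image and mask are assumptions, and adapthisteq is called with its default parameters because the source does not report the ones actually used.

```matlab
% Synthetic grayscale image and sky mask (assumptions for illustration).
gray = uint8(repmat(linspace(40, 220, 128), 128, 1));
skyMask = false(128, 128);
skyMask(1:40, :) = true;            % pretend the top rows are sky

% roifilt2 applies the function only where its mask is true,
% so the inverted sky mask restricts the equalization to the castle.
enhanced = roifilt2(gray, ~skyMask, @adapthisteq);
```

Pixels under the sky mask keep their original values, so the overexposed sky does not influence the result.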
As the last step before applying the edge detection algorithm, I rescaled the image to reduce its size;
this proved to improve the quality of the edge detection and of the subsequent line detection.
Then I performed the edge detection with the canny algorithm: unlike the other differentiation methods, it returns a binary image made of thin lines, which simplifies the subsequent line detection with the Hough transform.
I tuned the canny algorithm parameters with hysteresis thresholds of $\begin{bmatrix} 0.1 & 0.2 \end{bmatrix}$ and a sigma of the Gaussian filter of
To detect the lines in the image I used the Hough transform. For each point in the image (here, the edge image), this transform defines a family of straight lines passing through it, which in the parameter plane (given by the pairs
Thanks to the preprocessing I could keep the parameters of the hough function close to the default ones; I set the resolution of theta as
houghlines function, setting a maximum gap between two points on the same line of
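The edge and line detection chain can be sketched as follows. Only the hysteresis thresholds come from the text; the sigma, the theta resolution, the number of peaks and the maximum gap are hypothetical placeholders, and the input is a synthetic image.

```matlab
% Synthetic grayscale image (assumption): a bright rectangle on a dark background.
gray = zeros(200, 200);
gray(60:140, 50:150) = 1;

sigma = 2;                                    % hypothetical, not from the source
bw = edge(gray, 'canny', [0.1 0.2], sigma);   % thresholds as reported in the text

thetaRes = 0.5;                               % hypothetical theta resolution (degrees)
[H, theta, rho] = hough(bw, 'Theta', -90:thetaRes:(90 - thetaRes));
peaks = houghpeaks(H, 4);                     % keep at most the 4 strongest lines

fillGap = 10;                                 % hypothetical maximum gap (pixels)
segs = houghlines(bw, theta, rho, peaks, 'FillGap', fillGap);
```

Each element of segs carries the end points of a segment (point1, point2) together with its theta and rho.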
For this step I applied only histogram equalization to the image after the grayscale conversion. To detect the image features I tried several algorithms and chose SURF, which produced the best result (several of the other algorithms detected features only on the battlements).
Geometry (geometry.m)
To simplify the development of the required results I wrote some classes and functions.
- HX: represents a homogeneous vector. The multiplication between two HX instances is interpreted as a cross product; the class also provides functions to draw the vector as a line or as a point.
- Seg: represents a line segment. It provides the method line to retrieve the associated line.
- SegGroup: represents a group of line segments.
  - find_vanish_point (method): finds the intersection point of the lines by solving an optimization problem. It sets up a system of equations of the form $l^T p = 0$ and solves it with the svd method.
- get_normalized_transformation: given a set of homogeneous points, returns a similarity transformation that normalizes them.
- draw_axis: given a projective matrix $M$, draws the reference frame in the image.
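The two ideas behind these helpers, the cross product of homogeneous vectors and the least-squares intersection of several lines, can be sketched as follows. This is not the code of HX or find_vanish_point; the points are made up and assumed to be already normalized to values of order one.

```matlab
% Hypothetical ground-truth vanishing point, used only to build test lines.
vp_true = [10; 2; 1];

% Hypothetical points, one per segment.
starts = [0.10 0.05; 0.12 0.30; -0.08 0.50; 0.15 -0.15];

n = size(starts, 1);
L = zeros(n, 3);
for i = 1:n
    p1 = [starts(i, :) 1]';
    l  = cross(p1, vp_true);        % line through two homogeneous points
    L(i, :) = l' / norm(l);         % one equation l' * p = 0 per row
end

% The common point minimizes ||L * p|| subject to ||p|| = 1:
% it is the right singular vector of the smallest singular value.
[~, ~, V] = svd(L);
vp = V(:, end);
vp = vp / vp(3);                    % assumes a finite vanishing point
```

With noisy segments the rows of L no longer intersect exactly, and the same svd call returns the point that best fits all of them.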
Based on the experimental results, I decided to include some hand-picked lines (and lines found in other ways) to improve the accuracy of the calculations.
To recover the affine properties of the image we have to map the image of the line at infinity back to its canonical position $\begin{bmatrix}0&0&1\end{bmatrix}^T$.
To do so we need to compute the line at infinity for the plane
So I selected some lines parallel to plane SegGroup
to group the parallel line segments. With the method find_vanish_point I retrieved the vanishing points corresponding to the three groups of lines.
The find_vanish_point method poses the search for a vanishing point as a minimization problem. We know that a point $p$ lies on a line $l$ when $l^T p = 0$.
To reduce the numerical error of the svd function, the line coordinates are normalized by rescaling them around zero.
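One common way to build such a normalization is to move the centroid to the origin and scale the mean distance from it to $\sqrt{2}$. The sketch below shows this for points and is an assumption about the general technique, not the actual body of get_normalized_transformation.

```matlab
% Hypothetical pixel coordinates, one point per column.
pts = [100 400 350 120;
       200 250 600 580];
n = size(pts, 2);

c = mean(pts, 2);                                   % centroid
centered = pts - repmat(c, 1, n);
meanDist = mean(sqrt(sum(centered.^2, 1)));         % mean distance from the centroid
s = sqrt(2) / meanDist;                             % scale factor

% Similarity transformation: translate the centroid to the origin, then scale.
T = [s 0 -s * c(1);
     0 s -s * c(2);
     0 0  1];

ptsN = T * [pts; ones(1, n)];                       % normalized homogeneous points
```

Lines transform with the inverse transpose, $l' = T^{-T} l$, and a result computed in normalized coordinates has to be mapped back afterwards.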
Having found the 3 vanishing points, I used them to find the line at infinity in the image, which has to pass through all three of them. Because of noise, the line at infinity cannot satisfy the relation $l^T p = 0$ exactly for all three vanishing points, so I estimated it with the svd function after normalizing the data to reduce errors.
The projective transformation that restores the affine properties can be written as
So, the transformation to restore the affine properties is
Applying this transformation to the image, we can restore the affine properties.
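As an illustration of this step, the sketch below fits the line at infinity to three vanishing points and builds the standard affine-rectifying homography whose last row is that line. The vanishing points are invented, the data are noise free, and no normalization is applied, unlike in the actual pipeline.

```matlab
% Hypothetical vanishing points (homogeneous); v3 lies on the line through v1 and v2.
v1 = [1500; 300; 1];
v2 = [-800; 350; 1];
v3 = v1 + v2;

% Least-squares line through the three points: minimize ||A * l|| with ||l|| = 1.
A = [v1'; v2'; v3'];
[~, ~, V] = svd(A);
l_inf = V(:, end);
l_inf = l_inf / l_inf(3);           % fix the scale (assumes l_inf(3) is not zero)

% Homography whose last row is the imaged line at infinity.
H_aff = [1 0 0;
         0 1 0;
         l_inf'];

% Points map as x' = H * x, lines as l' = inv(H)' * l.
w1    = H_aff * v1;                 % vanishing point sent to infinity
l_new = H_aff' \ l_inf;             % line at infinity sent to [0 0 1]'
```

After the mapping, the third coordinate of every vanishing point is zero, which is what makes parallel lines of the plane parallel again in the image.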
To recover the metric properties, we can exploit the line-at-infinity conic (the conic dual to the circular points) $C^*_\infty = I J^T + J I^T$,
where $I = \begin{bmatrix} 1 & i & 0 \end{bmatrix}^T$ and $J = \begin{bmatrix} 1 & -i & 0 \end{bmatrix}^T$ are the circular points.
The form of
We can exploit the relation svd
method.
Gotten
The affine transformation may include a mirroring effect; when this happens, I remove it so that the image keeps its original orientation.
So, the affine transformation to restore metric properties is
To better evaluate the points on plane
The similarity homography, which includes rotation, translation and scale, is
The overall homography to map the image points of the plane
I selected the vertical lines shown in the figure to find the vertical vanishing point. As before, I put the vertical line segments in an instance of SegGroup and used the method find_vanish_point to find the intersection point of the lines associated with the segments. As seen previously, find_vanish_point solves the intersection problem as an optimization problem, using svd after the data normalization.
Now we have 4 vanishing points and a metric rectification homography that we can use to compute the intrinsic parameters of the camera.
In particular, I used the conic fitting technique. This technique sets up an optimization problem to find the conic
it is invariant to scale, so it has 5 dof.
As first constraint I set that
Then, I set the first soft constraint exploiting the homography computed at the point G1: given a metric rectified homography (
The other two required constraints are based on orthogonality: given two vanishing points
This equation gives two other constraints.
For these, I chose the vertical vanishing point
The 2 kinds of soft constraints cannot be used directly with the svd method; we need to write these constraints in the form
The first kind of constraint
Instead, the second kind
So, our constraints matrix
Because the rows of $A_r$ are linearly dependent, one of them could be dropped without affecting the result.
Due to the hard constraint
Before composing the matrix for the svd, the data were normalized with the function get_normalized_transformation, which returns a transformation
So I applied the svd function to
The optimization method might return a negative definite $\omega$ matrix, while the Cholesky factorization works only on positive definite matrices. However, $\omega$ is defined only up to a scale factor, so if this happens it is enough to change the sign of $\omega$.
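The sign fix and the factorization can be sketched as follows, assuming the usual relation $\omega = K^{-T} K^{-1}$ between the fitted conic and the calibration matrix $K$. The matrix K_true is a made-up example, and the data normalization used in the actual pipeline is left out.

```matlab
% Hypothetical calibration matrix, used only to build a test conic.
K_true = [800   0 320;
            0 820 240;
            0   0   1];

% Conic with the wrong sign, as the optimization might return it.
omega = -inv(K_true * K_true');

% chol reports failure through its second output instead of raising an error.
[R, notPD] = chol(omega);
if notPD ~= 0
    omega = -omega;                 % omega is defined up to scale, so flip the sign
    R = chol(omega);
end

% chol gives omega = R' * R with R upper triangular, hence R = inv(K) up to scale.
K = inv(R);
K = K / K(3, 3);                    % fix the scale so that K(3,3) = 1
```

Because the Cholesky factor is unique for a positive definite matrix, the recovered K is upper triangular with a positive diagonal.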
As the last step (because of the data normalization) I removed the preconditioning effect from
We need to compute a projective matrix which allows us to express the 3D points in an arbitrary reference frame different from the camera one.
This "world" reference frame to which we can refer the 3D point is the same put on the plane
So we need to compute the transformation between the camera reference frame and the world reference frame in the form $\begin{bmatrix}R & t\end{bmatrix}$, also called the extrinsic parameters of the camera.
To do this I used the knowledge of the metric rectification;
indeed, the homography computed at point G1 tells us how to map the 2D points of plane
where
The projective matrix
The resulting projective matrix is
I chose to express the homography in meters, so that the 3D points can be expressed directly in meters.
Having computed the projective matrix for the world reference frame, I could easily draw the frame in the image.
I could also determine the position of the camera
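A common way to obtain the extrinsic parameters from a plane homography is sketched below. It assumes a homography H that maps plane coordinates (world plane $Z = 0$, in meters) to image pixels, so with a rectifying homography that goes from the image to the plane its inverse would be used. All numeric values are made up.

```matlab
% Hypothetical ground truth, used only to build a test homography.
K = [800 0 320; 0 800 240; 0 0 1];
a = 0.3;
R_true = [cos(a) 0 sin(a); 0 1 0; -sin(a) 0 cos(a)];
t_true = [0.5; -0.2; 4];
H = K * [R_true(:, 1), R_true(:, 2), t_true];   % plane Z = 0 to image

% H = lambda * K * [r1 r2 t], so removing K leaves the scaled extrinsics.
A = K \ H;
if A(3, 3) < 0
    A = -A;                         % keep the plane origin in front of the camera
end
lambda = 1 / norm(A(:, 1));         % r1 must have unit length
r1 = lambda * A(:, 1);
r2 = lambda * A(:, 2);
r3 = cross(r1, r2);
t  = lambda * A(:, 3);

% With noisy data [r1 r2 r3] is not exactly a rotation: project it onto one.
[U, ~, V] = svd([r1 r2 r3]);
R = U * V';

P = K * [R, t];                     % projective matrix for the world frame
C = -R' * t;                        % camera position in world coordinates
```

The camera position follows from requiring that it projects to the zero vector, $P \begin{bmatrix} C \\ 1 \end{bmatrix} = 0$.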
We need to rectify a facade of the castle using the knowledge of matrix
Then I computed the camera matrix expressed in the new reference frame as
I defined 4 rectangle corners in the 3D world expressed in the new reference frame
Having 4 points on a plane in 3D space and their projections in the image, I could define a homography that rectifies facade 1.
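A homography between four plane points and their projections can be estimated with the direct linear transform, as in the sketch below. The corner coordinates and the pixel positions are invented for illustration and are not the ones used for the facade.

```matlab
% Hypothetical rectangle corners on the facade plane (meters) ...
X = [0 0; 6 0; 6 4; 0 4];
% ... and their hypothetical projections in the image (pixels).
x = [120 400; 520 380; 500 90; 140 60];

% Each correspondence contributes two rows to the linear system A * h = 0.
A = zeros(8, 9);
for i = 1:4
    Xi = [X(i, :) 1];
    u = x(i, 1);
    v = x(i, 2);
    A(2*i - 1, :) = [Xi, 0 0 0, -u * Xi];
    A(2*i,     :) = [0 0 0, Xi, -v * Xi];
end

% The solution is the right singular vector of the smallest singular value.
[~, ~, V] = svd(A);
H = reshape(V(:, end), 3, 3)';      % maps plane coordinates to image pixels

% The rectifying homography goes the other way, from the image to the plane.
H_rect = inv(H);
```

Warping the image with H_rect (up to a similarity that fixes the output size) gives the fronto-parallel view of the facade.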