During $2019.9 \sim 2020.1$, I took the Mix-Reality class taught by Hujun Bao.

The topics of this class are similar to those of computer vision. I really regret not listening carefully in class, even though Mr. Bao explained the key points carefully.

Mr. Bao gave me a book that he wrote. I must read it when I’m free.

Singular Value Decomposition (SVD)

  • A factorization of a general (not necessarily square) matrix, extended from eigendecomposition.
  • $A_{n \times m} = U_{n \times n}\Sigma_{n \times m}V^T_{m \times m}$
    • $(A^TA)v_i = \lambda_i v_i$
    • singular values: $\sigma_i = \sqrt{\lambda_i}$
    • $u_i=\frac{1}{\sigma_i}Av_i$
  • One can easily verify that a square matrix also satisfies this definition (for a symmetric positive semidefinite matrix, the SVD is the same as the eigendecomposition).
  • $U, V$ are orthogonal matrices.
  • Usually we keep only the $r$ largest singular values, with $r \in (0, rk(A)]$, to approximate $A$.
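The relations above can be checked numerically. The following is a minimal NumPy sketch, not part of the course material: the random test matrix and the helper name `truncated` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))               # n x m, not square

# Full SVD: U is n x n, Vt is m x m, s holds singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Singular values are the square roots of the eigenvalues of A^T A
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]   # eigvalsh returns ascending order
sigma_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))

# u_i = A v_i / sigma_i, shown here for the first singular pair
u0 = A @ Vt[0] / s[0]

def truncated(A, r):
    """Rank-r approximation of A built from the r largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

A2 = truncated(A, 2)                          # r = 2 < rk(A) = 3
```

With $r = rk(A)$ the truncation reproduces $A$; with a smaller $r$ the Frobenius error equals the norm of the discarded singular values.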

Transformation in 2D

| Name | Function | Preserve | DOF |
| --- | --- | --- | --- |
| Isometries | rotation, translation | distance | $3$ |
| Similarities | [above], scale | ratio of lengths, angles | $4$ |
| Affinities | | parallel lines, ratio of areas and lengths | $6$ |
| Projective | | cross ratio of 4 collinear points, collinearity | $8$ |

  • Rotation+Scaling+Translation

  • Affinities

  • Projective

    • Find $H$
      • Because the transformation is projective, the points are written as 3-vectors in homogeneous coordinates, so $x'$ and $Hx$ are equal only up to scale: $x' \times Hx = 0$.
      • Rewrite the $9$ parameters of $H$ as a column vector $h$. For one pair of points, it can be derived that $A_{3\times9}h_{9 \times 1}=0$. Note that although there are $3$ equations, only $2$ of them are independent. Stacking $N$ pairs, we finally get $A_{2N\times9}\cdot h=0$.
      • Use SVD to solve this equation: $A=U_{2N\times9}\Sigma_{9\times9}V^T_{9\times9}$. $h$ is the last column of $V$ (the last row of $V^T$).
  • Cross product in matrix form
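The steps for finding $H$ can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `estimate_homography`, the test matrix `H_true`, and the sample points are made up for the example, and no coordinate normalization or outlier handling is done.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from N >= 4 point pairs (N x 2 arrays)."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        X = np.array([x, y, 1.0])                 # homogeneous source point
        # The two independent rows of x' x (H x) = 0 for this pair (with w' = 1)
        rows.append(np.concatenate([np.zeros(3), -X, yp * X]))
        rows.append(np.concatenate([X, np.zeros(3), -xp * X]))
    A = np.asarray(rows)                          # shape (2N, 9)
    _, _, Vt = np.linalg.svd(A)
    h = Vt[-1]                                    # last row of V^T = last column of V
    H = h.reshape(3, 3)
    return H / H[2, 2]                            # fix the free scale (assumes H[2, 2] != 0)

# Synthetic check: map points with a known H, then recover it
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.2, 0.9, -3.0],
                   [1e-3, 2e-3, 1.0]])
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [5, 3]], dtype=float)
proj = (H_true @ np.c_[src, np.ones(len(src))].T).T
dst = proj[:, :2] / proj[:, 2:]
H_est = estimate_homography(src, dst)
```

Since $H$ is defined only up to scale ($8$ DOF), the result is rescaled so that its bottom-right entry is $1$ before comparing.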

Camera Model

  • Pinhole camera

    • Because the principal point (where the optical axis meets the image plane) is generally not at the origin of the image coordinates, we add shift parameters $c_x$ and $c_y$, so that $x'=f_x x+c_x,\ y'=f_y y+c_y$.

    • Why can't the aperture be too small?

      • Less light passes through
      • Diffraction effect
  • Lenses

    • For a thin lens:

Camera Calibration

  • intrinsic parameters
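    • From the pinhole camera model, there are $4$ parameters in total ($f_x, f_y, c_x, c_y$). Using the trick of homogeneous coordinates, finally: $\begin{bmatrix}x' \\ y' \\ 1\end{bmatrix} = \begin{bmatrix}f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x \\ y \\ 1\end{bmatrix}$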

  • extrinsic parameters
    • rotation and translation
    • $6$ parameters: $(\theta, \phi, \psi, c_x, c_y, c_z)$
  • distortion parameters
    • Radial distortion

      ![](Mix-Reality/camera5.png)
    
    • Tangential distortion

      ![](Mix-Reality/camera7.png)
    
    • $5$ parameters: $(k_1,k_2,k_3,p_1,p_2)$
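As an illustration of how the $5$ distortion parameters act on a point, here is a minimal sketch. It assumes the common Brown-Conrady form (the convention used for OpenCV's distortion coefficients), which may differ from the formulas in the figures above; the function name `distort` and the sample values are made up for the example.

```python
import numpy as np

def distort(xy, k1, k2, k3, p1, p2):
    """Apply radial + tangential distortion to normalized image points (N x 2).

    Assumed model (Brown-Conrady, OpenCV-style coefficient order).
    """
    x, y = xy[:, 0], xy[:, 1]
    r2 = x**2 + y**2
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3      # radial scaling factor
    x_tan = 2 * p1 * x * y + p2 * (r2 + 2 * x**2)       # tangential shift in x
    y_tan = p1 * (r2 + 2 * y**2) + 2 * p2 * x * y       # tangential shift in y
    return np.stack([x * radial + x_tan, y * radial + y_tan], axis=1)

pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.3, 0.2]])
# A negative k1 pulls points toward the center (barrel distortion)
barrel = distort(pts, k1=-0.2, k2=0.0, k3=0.0, p1=0.0, p2=0.0)
```

The points are in normalized coordinates, that is, before applying $f_x, f_y, c_x, c_y$.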
  • Camera Calibration

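    • Without distortion, the transform matrices are as follows ($s$ is the skew parameter):

    • Number of parameters: $5+3+3=11$ ($5$ intrinsic, $3$ rotation, $3$ translation). Each correspondence gives $2$ equations, so at least $6$ correspondences are needed.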

  • Homogeneous $M \times N$ Linear Systems

    • $Ax=0$, $A_{M \times N}$, $M > N$
    • To find a non-zero solution, minimize $|Ax|^2$ under the constraint $|x|^2=1$.
    • A possible method: Direct Linear Transformation
    • General method for the calibration problem: compute the SVD of $A$; the last column of $V$ gives $x$.
    • Degenerate cases
      • Points cannot lie on the same plane.
      • Points cannot lie on the intersection curve of two quadric surfaces.
  • Taking Radial Distortion into Account

    • nonlinear
    • Methods
      • Newton Method
      • Levenberg-Marquardt Algorithm
    • The latter doesn’t require the computation of the Hessian $H$.
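The distortion-free calibration described above ($11$ parameters, homogeneous system solved by SVD) can be sketched as a Direct Linear Transformation. This is a minimal sketch under stated assumptions: the names `calibrate_dlt` and `project`, the camera values, and the random 3D points are made up for illustration, and the result is the $3 \times 4$ projection matrix up to scale, not the separated intrinsic and extrinsic parameters.

```python
import numpy as np

def calibrate_dlt(X, uv):
    """Estimate the 3x4 projection matrix M (up to scale) from N >= 6 pairs.

    X: N x 3 world points (not all coplanar), uv: N x 2 image points.
    """
    rows = []
    for Xw, (u, v) in zip(X, uv):
        P = np.append(Xw, 1.0)                           # homogeneous world point
        rows.append(np.concatenate([P, np.zeros(4), -u * P]))
        rows.append(np.concatenate([np.zeros(4), P, -v * P]))
    A = np.asarray(rows)                                 # shape (2N, 12)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                          # minimizes |Am|^2 with |m|^2 = 1

def project(M, X):
    """Project N x 3 world points with a 3x4 matrix and dehomogenize."""
    p = (M @ np.c_[X, np.ones(len(X))].T).T
    return p[:, :2] / p[:, 2:]

# Synthetic camera: intrinsics with skew, rotation about the y axis, translation
K = np.array([[800.0, 0.5, 320.0], [0.0, 810.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([[0.1], [-0.2], [5.0]])
M_true = K @ np.hstack([R, T])

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(8, 3))                      # 8 non-coplanar points
uv = project(M_true, X)
M_est = calibrate_dlt(X, uv)
```

With noisy data or radial distortion this linear estimate would typically serve only as the starting point for a nonlinear refinement such as Levenberg-Marquardt.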

Stereo-view Geometry

  • Sets of parallel lines on the same plane lead to collinear vanishing points.

  • Epipolar Geometry

  • Epipolar Constraint

    • Denote $p=K[I,0]P$ and $p'=K[R,T]P$.

    • Let $x=K^{-1}p$ and $x'=K^{-1}p'$, finally we get that $x^T \cdot [T \times (Rx')] = 0$, which is called the Epipolar Constraint. It means that the vectors $x$, $T$ and $Rx'$ are coplanar.

    • Denote $E=T \times R = [T]_\times R$, then $x^TEx'=0$. $E$ is called the Essential Matrix.

    • Properties of the Essential Matrix

    • Writing $K$ back (it may differ between cameras, $K$ and $K'$) gives $p^TFp'=0$ with $F=K^{-T}EK'^{-1}$. $F$ is called the Fundamental Matrix.

      • The properties of the Fundamental Matrix are similar to those of the Essential Matrix. It has $7$ DOF.
  • Solve for Fundamental Matrix

    • $Wf = 0$ ($f_{9 \times 1}$ collects the parameters in $F$).
      • If $rk(W)=8$: there exists a unique $f$ (up to scale).
      • If $rk(W)>8$ (noisy data): find the least-squares estimate $\hat F$ by SVD.
    • Note that $F$ has rank $2$ but $\hat F$ may not. Second step: find $F$ that minimizes $||F-\hat F||$ subject to $\det(F)=0$.
    • Normalization
      • Transform the image coordinates (separately for each image) before calculating $F$.
      • Find a transform such that: (1) the origin is the centroid of the image points; (2) the mean square distance of the data points from the origin is $2$ pixels.
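The procedure above (normalize, solve $Wf=0$ by SVD, enforce rank $2$, undo the normalization) can be sketched as follows. This is a minimal sketch, assuming the convention $p^TFp'=0$ used in these notes; the names `normalize`, `eight_point`, `project`, and all camera values are made up for illustration.

```python
import numpy as np

def normalize(pts):
    """Similarity transform: centroid to origin, mean squared distance 2."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0 / np.mean(np.sum((pts - c) ** 2, axis=1)))
    T = np.array([[s, 0.0, -s * c[0]], [0.0, s, -s * c[1]], [0.0, 0.0, 1.0]])
    return (T @ np.c_[pts, np.ones(len(pts))].T).T, T

def eight_point(p, p_prime):
    """Estimate F with p^T F p' = 0 from N >= 8 correspondences (N x 2 arrays)."""
    q, T = normalize(p)
    q_prime, T_prime = normalize(p_prime)
    # Each row of W holds the products q_i * q'_j, matching f = F.reshape(-1)
    W = np.array([np.kron(a, b) for a, b in zip(q, q_prime)])
    _, _, Vt = np.linalg.svd(W)
    F_hat = Vt[-1].reshape(3, 3)
    # Closest rank-2 matrix in Frobenius norm: zero the smallest singular value
    U, s, Vt = np.linalg.svd(F_hat)
    s[2] = 0.0
    F = U @ np.diag(s) @ Vt
    return T.T @ F @ T_prime                  # undo the normalization

def project(K, R, t, P):
    x = (K @ (R @ P.T + t[:, None])).T
    return x[:, :2] / x[:, 2:]

# Synthetic two-view setup (all values are assumptions for the example)
rng = np.random.default_rng(2)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
a = 0.2
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.1, 0.2])
P = rng.uniform([-1, -1, 4], [1, 1, 8], size=(12, 3))

p = project(K, np.eye(3), np.zeros(3), P)
p_prime = project(K, R, t, P)
F = eight_point(p, p_prime)
```

The normalization mainly improves the conditioning of $W$; on noise-free data like this the unnormalized version would also work, but on real pixel coordinates it tends to be much less stable.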

Stereo systems

  • Some applications

    • Stereo vision: Estimate the position of $P$ given the observation of $P$ from two view points.
    • Triangulation: Intersecting the two lines of sight gives rise to $P$
  • Making image planes parallel

    • Goal: Estimate the perspective transformation $H$ that makes the images parallel.
    • To be continued…
  • Correspondence problem

    • Correlation Methods
    • Smaller window
      • More detail
      • More noise
    • Larger window
      • Smoother disparity maps
      • Less prone to noise
    • To be continued…
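A correlation method for the correspondence problem can be sketched as window matching along a scanline. This is a minimal sketch that assumes an already rectified image pair and uses the sum of squared differences (SSD) as the matching cost; the function name `disparity_at` and the synthetic images are made up for illustration.

```python
import numpy as np

def disparity_at(left, right, y, x, half, max_disp):
    """Disparity of left pixel (y, x), minimizing SSD over a (2*half+1)^2 window.

    Assumes rectified images and that the window around (y, x) lies inside the image.
    """
    ref = left[y - half:y + half + 1, x - half:x + half + 1]
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        xr = x - d                                # candidate column in the right image
        if xr - half < 0:
            break                                 # window would leave the image
        cand = right[y - half:y + half + 1, xr - half:xr + half + 1]
        cost = np.sum((ref - cand) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

rng = np.random.default_rng(3)
left = rng.random((40, 60))                       # random texture
true_d = 4
right = np.roll(left, -true_d, axis=1)            # right[y, x - 4] == left[y, x]

d_small = disparity_at(left, right, y=20, x=30, half=1, max_disp=10)
d_large = disparity_at(left, right, y=20, x=30, half=5, max_disp=10)
```

On this noise-free texture both window sizes recover the shift; the trade-off between detail and noise listed above only shows up on real images.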

Image Processing

  • Filters

    • goals

      • Extract useful information from the images
      • Modify or enhance image properties
    • Gaussian Filters

      • Rule of thumb: set filter half-width to about $3\sigma$
      • Separable kernel; Convolution with self is another Gaussian
    • Median and Mean filter
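The Gaussian filter properties above can be illustrated with a small kernel. This is a minimal sketch; the function name `gaussian_kernel_1d` and the value of `sigma` are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian kernel with half-width ceil(3 * sigma)."""
    half = int(np.ceil(3 * sigma))
    x = np.arange(-half, half + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

sigma = 1.5
g = gaussian_kernel_1d(sigma)

# Separable: the 2D kernel is the outer product of two 1D kernels
G2d = np.outer(g, g)

# Convolving the kernel with itself: the variances add, so the variance doubles
gg = np.convolve(g, g)
x_g = np.arange(len(g)) - len(g) // 2
x_gg = np.arange(len(gg)) - len(gg) // 2
var_g = np.sum(g * x_g**2)
var_gg = np.sum(gg * x_gg**2)
```

Separability means a 2D Gaussian blur can be applied as one pass over rows and one over columns, which is cheaper than a full 2D convolution.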

  • Differentiation

  • Sub-sampling

    • Problem: Aliasing
    • Sampling Theorem (Nyquist): When sampling a signal at discrete intervals, the sampling frequency must be at least $2f_{max}$ ($f_{max}$ is the highest frequency in the signal).
  • Edge Detection

    • Edge: a location with high gradient
    • Most widely used method: Canny Edge Detection
      1. Gaussian smoothing
      2. Derivative of Gaussian
      3. Find magnitude and orientation of gradient
      4. Extract edge points: “Non-maximum suppression”
      5. Linking and thresholding “Hysteresis”
  • Harris Corner Detector

    • Denote $E(u,v)=\sum \limits_{x,y} w(x,y)[I(x+u,y+v)-I(x,y)]^2$. At a corner, $E(u,v)$ is large for shifts $(u,v)$ in every direction.

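    • Using a bilinear (first-order Taylor) approximation, we can derive that $E(u,v) \approx \begin{bmatrix}u & v\end{bmatrix} M \begin{bmatrix}u \\ v\end{bmatrix}$, where $M=\sum \limits_{x,y} w(x,y)\begin{bmatrix}I_x^2 & I_xI_y \\ I_xI_y & I_y^2\end{bmatrix}$ has eigenvalues $\lambda_1, \lambda_2$:

      • $\lambda_1 \sim \lambda_2$ (both large): Corner
      • $\lambda_1 \gg \lambda_2$: Edge
    • Set $R=\det(M)-k \cdot \mathrm{Trace}(M)^2$, use $R$ to judge corners ($k \in [0.04,0.06]$).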

    • Property: Rotation invariance
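The Harris response can be sketched directly from the definition of $M$. This is a minimal sketch under stated assumptions: it uses a box window for $w(x,y)$ (a Gaussian window is more common in practice), central differences for the gradients, and the function name `harris_response` and the test image are made up for illustration.

```python
import numpy as np

def harris_response(img, k=0.05, half=2):
    """Harris response R = det(M) - k * trace(M)^2 with a box window w(x, y)."""
    Iy, Ix = np.gradient(img.astype(float))       # derivatives along rows, columns

    def window_sum(a):
        # Sum over a (2*half+1)^2 neighborhood; np.roll wraps at the borders,
        # so values within `half` pixels of the border are not reliable
        out = np.zeros_like(a)
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    Sxx = window_sum(Ix * Ix)
    Syy = window_sum(Iy * Iy)
    Sxy = window_sum(Ix * Iy)
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - k * trace**2

# Bright quadrant: a corner at (10, 10), straight edges along row 10 and column 10
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
R = harris_response(img)
```

In this example $R$ is positive at the corner, negative on the straight edge, and zero in flat regions, matching the eigenvalue cases above.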

Fitting

  • Goal: Choose a parametric model to fit a certain quantity from data.
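As one simple instance of this goal, a line can be fitted to points by least squares. This is a minimal sketch; the model $y = mx + b$ and the sample data are assumptions chosen for the example, not something specified in the notes.

```python
import numpy as np

# Sample data lying exactly on y = 2x + 1 (made up for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Parametric model y = m * x + b, written as A @ [m, b] = y
A = np.c_[x, np.ones_like(x)]
(m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
```

Least squares is sensitive to outliers, so with real data a robust alternative may be preferable.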