Finite Set Statistics for Multiple-Object Tracking

Single Object Tracking

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

Recall: Bayes Rule

\[ P(A \vert B) = \frac{ P(A) P(B \vert A) }{ P(B) } \qquad(1)\]

\[ P(A \vert B) = \frac{ P(A) P(B \vert A) }{ \sum_A P(A) P(B \vert A) } \]

Using PDFs:

\[ p(x \vert y) = \frac{ p(x) p(y \vert x) }{ \int p(x) p(y \vert x)dx } \]

State estimation using Bayes Rule

If we have a prior on a state (e.g., for localization) \(p(x_k \vert z^{k-1})\), we can take observations \(z_k\) into account using the likelihood function:

\[ p(x_k \vert z^k) = \frac{ p(x_k \vert z^{k-1}) g(z_k \vert x_k) }{ \int p(x_k \vert z^{k-1}) g(z_k \vert x_k)dx_k } \]

We can compute the prior at time \(k\) from time \(k-1\) using the Chapman-Kolmogorov equation:

\[ p(x_k \vert z^k) = \frac{ \int f(x_k \vert x_{k-1})p(x_{k-1} \vert z^{k-1})g(z_k \vert x_k) dx_{k-1} }{ \iint f(x_k \vert x_{k-1})p(x_{k-1} \vert z^{k-1})g(z_k \vert x_k) dx_{k-1}dx_k } \]

A particular case - the Kalman Filter

Assuming linear evolution and observation models and zero-mean Gaussian noises, we have

\[ p(x_{k} \vert z^{k-1}) = \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \]

and

\[ g(z_k \vert x_k) = \mathcal{N}(z_k; H_k x_k, R_k) \]

The posterior density obtained by the Kalman update is the solution of Bayes rule under the previous assumptions:

\[ p(x_k \vert z^k) = \frac{ \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \mathcal{N}(z_k; H_k x_k, R_k) }{ \int \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \mathcal{N}(z_k; H_k x_k, R_k)dx_k } \]
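For reference, this posterior is again Gaussian, \(p(x_k \vert z^k) = \mathcal{N}(x_k; \mu_{k \vert k}, P_{k \vert k})\), with the familiar closed-form update:

\[ \begin{aligned} S_k & = H_k P_{k \vert k-1} H_k^\top + R_k \\ K_k & = P_{k \vert k-1} H_k^\top S_k^{-1} \\ \mu_{k \vert k} & = \mu_{k \vert k-1} + K_k (z_k - H_k \mu_{k \vert k-1}) \\ P_{k \vert k} & = (I - K_k H_k) P_{k \vert k-1} \end{aligned} \]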

Object Tracking using KF

Scenario

Toy example: a single target moving in 1-D, with an associated constant-velocity model:

\[ X_k = \begin{pmatrix} x_k \\ v_k \end{pmatrix},\ X_{k+1} = \begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix} X_k \]

We only observe \(x\), so the observation model is:

\[ z_k = \begin{pmatrix} 1 & 0 \end{pmatrix} X_k \]

Assume process noise \(Q = \begin{pmatrix}0.1 & 0 \\ 0 & 0.2\end{pmatrix}\) and observation noise \(R = 0.3\).
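A minimal numpy sketch of one predict/update cycle for this model (the initial state, initial covariance, and measurement values are assumptions for illustration):

Code
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity evolution
H = np.array([[1.0, 0.0]])              # we only observe the position x
Q = np.array([[0.1, 0.0], [0.0, 0.2]])  # process noise
R = np.array([[0.3]])                   # observation noise

x = np.zeros(2)  # assumed initial state
P = np.eye(2)    # assumed initial covariance

def kf_step(x, P, z):
  # Prediction
  x = F @ x
  P = F @ P @ F.T + Q
  # Update
  S = H @ P @ H.T + R             # innovation covariance
  K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
  x = x + K @ (z - H @ x)
  P = (np.eye(2) - K @ H) @ P
  return x, P

for z in [np.array([0.9]), np.array([2.1])]:  # made-up measurements
  x, P = kf_step(x, P, z)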

Object Tracking using KF

Prior

Figure 1: Prior density on \(X\)

Object Tracking using KF

Prediction 1

Figure 2: Prior and prediction of \(X\)

Object Tracking using KF

Observation 1

Figure 3: Prediction of \(X\) and first observation \(z\)

Object Tracking using KF

Update 1

Figure 4: Kalman update using prediction and observation

Object Tracking using KF

Prediction 2

Figure 5: Update result and prediction of next timestep

Object Tracking using KF

Observation 2

Figure 6: Prediction and second observation

Object Tracking using KF

Update 2

Figure 7: Kalman update using prediction and observation

“Simple” Multiple Object Tracking

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

The Problem

How to track a known and constant number of targets?

Figure 8: A problem of multiple object tracking

Conceptual solution: use one KF per track and associate measurements to tracks (nearest neighbor, global nearest neighbor, …), as sketched below.
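As a sketch, global nearest neighbor (GNN) association can be computed with the Hungarian algorithm on a distance matrix (the predicted positions and measurements below are made-up 1-D values):

Code
import numpy as np
from scipy.optimize import linear_sum_assignment

preds = np.array([0.0, 5.0])  # predicted track positions (assumed)
meas = np.array([5.2, -0.1])  # incoming measurements (assumed)

# Cost matrix of squared distances between every track and measurement
cost = (preds[:, None] - meas[None, :]) ** 2
# Optimal one-to-one assignment: track rows[i] gets measurement cols[i]
rows, cols = linear_sum_assignment(cost)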

Prediction

Figure 9: Independent prediction for each target

Observations and association

Figure 10: Predictions and observations

Update

Figure 11: Update step

Limits

How to deal with

  • an unknown / time-varying number of tracks;
  • missed detections;
  • clutter measurements (false positives);
  • the confidence in the existence of the targets?

Random Finite Sets

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

What is a Finite Set?

A set is a collection of different things, called its elements. Sets are unordered.

\(X = \{\} = \emptyset\) is the empty set.

\(X = \{x\}\) is a singleton.

A set may contain a finite or an infinite number of elements.

A Finite Set is a set containing a finite number of elements.

The number of elements is the cardinality of the set, denoted \(\lvert X \rvert\).

Figure 12: Example of a set of polygons

Random Finite Set

Definition

A Random Finite Set (RFS) is a set containing a random number of random variables, i.e.,

\[ X = \left\{ x_1, \dots, x_n \right\} \]

where both the cardinality \(n\) and the \(x_i\) are randomly distributed.

Elements of a RFS

The \(x_i\) may be random vectors of any dimension, for example the position of a target \(\begin{pmatrix}x \\ y \end{pmatrix}\) or its pose \(\begin{pmatrix}x \\ y \\ \theta \end{pmatrix}\).

Set Integral, PDF

Multi-object PDF \(\pi(X) \geq 0\), s.t.

\[ \int \pi(X)dX = 1 \]

based on the set integral:

\[ \int \pi(X)dX = \sum_{i=0}^{\infty}\rho(i) \]

With \(\rho(i)\) the cardinality distribution of \(X\):

\[ \rho(i) \triangleq \frac{1}{i!}\int\pi(\left\{ x_1, \dots, x_i \right\})dx_1 \dots dx_i \]

Common Random Finite Sets

Bernoulli RFS

\[ \pi(X) = \begin{cases} 1 - r & \text{if } X = \emptyset \\ r \cdot p(x) & \text{if } X = \left\{ x \right\} \\ 0 & \text{otherwise} \end{cases} \]

  • Defined by \((r, p)\)
  • Represents an element with existence probability \(r\) and distribution \(p\)
  • Cardinality is either 0 or 1.
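As a sanity check, the set integral of this density is 1, matching the cardinality distribution \(\rho(0) = 1 - r\), \(\rho(1) = r\):

\[ \int \pi(X)dX = (1 - r) + r\int p(x)dx = (1 - r) + r = 1 \]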

Common Random Finite Sets

Multi-Bernoulli (MB)-RFS

  • Union of \(M\) independent Bernoulli RFS
  • Defined by \(\left\{ \left( r^{(i)}, p^{(i)} \right) \right\}_{i=1}^M\)
  • \(0 \leq \lvert X \rvert \leq M\)

Common Random Finite Sets

Poisson RFS

Also called Poisson Point Process

  • \(\lvert X \rvert \sim \mathcal{P}(\bar\lambda)\)
  • \(x_i\) i.i.d. \(\sim p(\cdot)\)
  • \(\pi(\left\{ x_1, \dots, x_n \right\}) = e^{-\bar\lambda} \bar\lambda^n p(x_1)\dots p(x_n)\)

Commonly used to model clutter measurements (false positives).

Intensity Function of a Poisson RFS

The intensity function of a Poisson RFS is defined as

\[ \lambda(x) = \bar\lambda \cdot p(x) \]

We can recover \(\bar\lambda\) and \(p(\cdot)\) from it:

\[ \bar\lambda = \int\lambda(x)dx \]

\[ p(x) = \frac{\lambda(x)}{\bar\lambda} = \frac{\lambda(x)}{\int\lambda(x)dx} \]

Realizations of a Poisson RFS

Example:

  • \(\lvert X \rvert \sim \mathcal{P}(4)\)
  • \(x_i\) i.i.d. \(\sim U(-2, 2) \times U(-2, 2)\)

Figure 13: Realizations of a Poisson RFS
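Such realizations can be drawn in a few lines of numpy (a sketch; the seed is arbitrary):

Code
import numpy as np

rng = np.random.default_rng(0)
n = rng.poisson(4)                        # draw the cardinality ~ P(4)
points = rng.uniform(-2, 2, size=(n, 2))  # n i.i.d. points ~ U(-2,2)^2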

But, why Random Finite Sets?

  • Because they express both the uncertainty about the number of targets and the targets themselves.
  • Because we can use the Bayes rule with them:

\[ p(X_k \vert Z^k) = \frac{ p(X_k \vert Z^{k-1}) g(Z_k \vert X_k) }{ \int p(X_k \vert Z^{k-1}) g(Z_k \vert X_k) dX_k } \]

\(g(Z_k \vert X_k)\) is the multi-object likelihood.

Multiple object tracking using RFS

Let’s use a more complicated model:

  • unknown and time varying number of targets;
  • detection probability \(P_D(x) \leq 1\);
  • clutter modeled as a PPP with intensity function \(\lambda_c(\cdot)\);
  • the measurements of one target are independent of the other measurements and targets;
  • each measurement is the result of at most one object.

Use RFS to represent the states (\(X_k\)) and the measurements (\(Z_k\)).

Motion Model

=> Model both the evolution of existing objects and the birth of new ones, as written below.
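One standard way to write this (following the usual FISST motion model; the notation is introduced here): each object \(x \in X_{k-1}\) survives as a Bernoulli RFS \(S_{k \vert k-1}(x)\) with survival probability \(P_S\), and a birth RFS \(B_k\) is added:

\[ X_k = \left[ \bigcup_{x \in X_{k-1}} S_{k \vert k-1}(x) \right] \cup B_k \]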

Measurement Model

\(Z_k = \left\{ z_k^1, \dots, z_k^{m_k} \right\}\) and \(X_k = \left\{ x_k^1, \dots, x_k^{n_k} \right\}\)

Observations \(Z_k = C_k \cup O_k\), the union of clutter measurements \(C_k\) and object-generated measurements \(O_k\).

Using the association hypothesis vector \(\theta_k = \left[ \theta_k^1, \dots, \theta_k^{n_k} \right]\) where

\[ \theta_k^i = \begin{cases} j & \text{if object } i \text{ is associated to measurement } j \\ 0 & \text{if object } i \text{ is undetected} \end{cases} \]

The observation model is

\[ p(Z_k \vert X_k) = \sum_{\theta_k} \left( \exp(-\bar\lambda_c) \prod_{j=1}^{m_k}\lambda_c(z_k^j)\prod_{i:\theta_k^i=0}(1 - P_D(x_k^i)) \\ \times \prod_{i: \theta_k^i \gt 0}\frac{ P_D(x_k^i)g_k(z_k^{\theta_k^i} \vert x_k^i) }{ \lambda_c(z_k^{\theta_k^i}) } \right) \]

Measurement Model - Other Form

\[ p(Z_k \vert X_k) = p_{c_k}(Z_k)g_k(\emptyset \vert X_k)\sum_{\theta_k}\prod_{i:\theta_k^i \gt 0}\frac{ P_D(x_k^i)g_k(z_k^{\theta_k^i} \vert x_k^i) }{ \lambda_c(z_k^{\theta_k^i})(1 - P_D(x_k^i)) } \]

with

\[ p_{c_k}(Z_k) = \exp(-\bar\lambda_c)\prod_{s=1}^{m_k}\lambda_c(z_k^s) \]

\[ g_k(\emptyset \vert X_k) = \prod_{j=1}^{n_k}(1 - P_D(x_k^j)) \]

This second form follows from the first by multiplying and dividing each detected-object factor by \(1 - P_D(x_k^i)\), which lets the common factor \(g_k(\emptyset \vert X_k)\) be pulled out of the sum.

Tractability

The exact equations above are intractable in practice => we need to introduce approximations.

The Probability Hypothesis Density Filter

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

The PHD function

The Probability Hypothesis Density \(D_X(x)\) of a RFS \(X\) is defined as:

\[ \begin{aligned} D_X(x) & = \int p_X(X)\sum_{x' \in X}\delta(x-x')dX \\ & = \int p_X(\left\{ x \right\} \cup X)dX \end{aligned} \]

It is the first-order moment of the RFS, also called intensity function of the RFS.

\[ \int_A D_X(x)dx = \mathbb{E}\left[\left| X \cap A \right|\right] \]

=> Expected number of elements in the subspace \(A\).

PHD of a Bernoulli RFS

If \(X\) is a Bernoulli RFS parametrized by \(r\) and \(p(\cdot)\), its PHD is

\[ D_X(x) = r \cdot p(x) \]

Example with \(r = 0.75\) and \(p(x) = \mathcal{N}(x; 1.2, 0.78)\)

Figure 14: PHD of a Bernoulli RFS

PHD of a Multi-Bernoulli RFS

If \(X\) is a Multi-Bernoulli RFS parametrized by \(\left\{ r^{(i)}, p^{(i)}(\cdot) \right\}_{i=1}^{N}\), its PHD is

\[ D_X(x) = \sum_{i=1}^{N} r^{(i)}p^{(i)}(x) \]

Example 1

MB-RFS with 2 components \((1, \mathcal{N}(-2.3, 1.4))\) and \((1, \mathcal{N}(1.2, 0.85))\)

Figure 15: PHD of a Multi-Bernoulli RFS - eg. 1

Example 2

MB-RFS with 2 components \((0.95, \mathcal{N}(-2.3, 1.4))\) and \((0.65, \mathcal{N}(1.2, 0.85))\)

Figure 16: PHD of a Multi-Bernoulli RFS - eg. 2

PHD of a Poisson RFS

If \(X\) is a Poisson RFS with intensity function \(\lambda(\cdot)\), its PHD is

\[ D_X(x) = \lambda(x) \]

Approximation of a RFS using the PHD

Since a Poisson RFS is fully characterized by its PHD, any RFS may be approximated by the Poisson RFS parametrized by the same PHD.

Example

Let \(X\) be an MB-RFS with 2 components \((0.95, \mathcal{N}(-2.3, 1.4))\) and \((0.65, \mathcal{N}(1.2, 0.85))\).

\[ D_X(x) = 0.95\, \mathcal{N}(x; -2.3, 1.4) + 0.65\, \mathcal{N}(x; 1.2, 0.85) \]

\(X\) can be approximated by a PPP with intensity function \(\lambda(x) = D_X(x)\)
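Note that the expected number of objects is preserved by this approximation: \(\int D_X(x)dx = 0.95 + 0.65 = 1.6\).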

PHD Filter - Toy Example

  • 2 features located at \(-1.0\) and \(1.0\) along the x axis.
  • The y axis is such that \(y(x) = PHD(x)\).

The prior is composed of two Gaussians \(\mathcal{N}(-1.02, 0.3^2)\) and \(\mathcal{N}(1.1, 0.4^2)\) with respective weights \(0.85\) and \(0.92\).

Figure 17: Prior PHD of the filter

First update

Observations

Suppose we observe one object at \(0.99\), but we miss the object at \(-1\). The observation noise is set to \(R = 0.15\).

We can plot the observation above the prior:

Figure 18: Prior PHD and observation

First update

Kalman update

Perform a Kalman update for each component of the prior, using the observation:

Code
# There is only one observation => no loop over the observation set.
# Assumed context: prior_gm is a list of (weight, scipy.stats.norm)
# pairs, o_1 is the observation (a frozen norm), R the observation
# noise variance, Pd the detection probability, c_z the clutter
# intensity at the observation.
from math import sqrt
from scipy.stats import norm

w_sum = 0
tmp_gm = []
kalman_updated_gm = []

# decompose each component as weight + normal
for w, n in prior_gm:
  S = R + n.var() # innovation covariance
  K = n.var() / S # Kalman gain
  error = o_1.mean() - n.mean() # innovation

  mu = n.mean() + K * error
  var = (1 - K) * n.var()
  # Numerator of the weight update; norm.pdf expects a standard
  # deviation, hence sqrt(S)
  weight = Pd * w * norm.pdf(o_1.mean(), n.mean(), sqrt(S))
  w_sum += weight
  tmp_gm.append((weight, norm(mu, sqrt(var))))

# Divide the weights by clutter intensity + sum of detection weights
for w, n in tmp_gm:
  weight = w / (c_z + w_sum)
  kalman_updated_gm.append((weight, n))

Figure 19: Kalman update using observation

First update

Take into account potentially missed objects

Multiply the predicted (prior) PHD by \(1- P_D\) to add the possibility of missed objects:
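In code, this step is a one-liner on the Gaussian-mixture form, reusing the names of the update code above (prior_gm, Pd):

Code
# Each prior component survives as a miss-detection hypothesis,
# with its weight scaled by 1 - Pd
missed_gm = [((1 - Pd) * w, n) for w, n in prior_gm]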

Figure 20: Add component to represent miss-detection hypotheses

First update

Add the previous PHDs to finish the update

Figure 21: Updated PHD

Prediction

Propagate previous update result

Scale by \(P_S\), the probability of survival:
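As a sketch (the features are static in this toy example, so only the weights change; updated_gm and Ps are assumed names for the previous update result and the survival probability):

Code
# Static dynamics: propagate each component by scaling its weight by Ps
propagated_gm = [(Ps * w, n) for w, n in updated_gm]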

Figure 22: Predicted PHD from previous step

Prediction

Birth model

Create a component for each observation from the previous step. The birth model follows a \(\mathcal{N}(\mu, R)\) with weight \(0.2\) (probably too high in practice, but fine for visualization); a sketch follows.
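A minimal sketch of this birth step, assuming obs holds the previous observations as frozen norm objects (as in the update code) and R is the observation noise variance:

Code
from math import sqrt
from scipy.stats import norm

# One birth component per previous observation: weight 0.2, mean at
# the observation, variance R
birth_gm = [(0.2, norm(o.mean(), sqrt(R))) for o in obs]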

Figure 23: Birth model

Prediction

The predicted PHD is the sum of the propagated PHD and the birth PHD.

Figure 24: Predicted PHD

Second update

Observations

Suppose this time we have the following observations (two true detections plus clutter):

Figure 25: Predicted PHD and observation

Second update

Kalman update

Update each Gaussian component with each observation.

Code
kalman_updated_gm = []

for o in obs:
  w_sum = 0
  tmp_gm = []

  # decompose component as weight + normal
  for w, n in predicted_phd:
    S = R + n.var() # innovation covariance
    K = n.var() / S # Kalman gain
    error = o.mean() - n.mean() # Innovation

    mu = n.mean() + K * error
    var = (1 - K) * n.var()
    # Numerator of the weight update; norm.pdf expects a standard
    # deviation, hence sqrt(S)
    weight = Pd * w * norm.pdf(o.mean(), n.mean(), sqrt(S))
    w_sum += weight
    tmp_gm.append((weight, norm(mu, sqrt(var))))

  # Division for weights
  for w, n in tmp_gm:
    weight = w / (c_z + w_sum)
    kalman_updated_gm.append((weight, n))

Figure 26: Kalman update using observation

Second update

Take into account potentially missed objects

Multiply the predicted PHD by \(1- P_D\) to add the possibility of missed objects:

Figure 27: Add components to represent miss-detected hypotheses

Second update

Add previous PHDs

Figure 28: Updated PHD

Prediction

Propagate previous update result

Scale by \(P_S\), the probability of survival:

Figure 29: Predicted PHD from previous step

Prediction

Birth model

Create a component for each observation from the previous step. The birth model follows a \(\mathcal{N}(\mu, R)\) with weight \(0.1\).

Figure 30: Birth model

Prediction

The predicted PHD is the sum of the propagated PHD and the birth PHD.

Figure 31: Predicted PHD

Third update

Observations

Suppose this time we have the following observations (two true detections, with larger noise on the second):

Figure 32: Predicted PHD and observation

Third update

Kalman update

Update each Gaussian component with each observation.

Code
kalman_updated_gm = []

for o in obs:
  w_sum = 0
  tmp_gm = []

  # decompose component as weight + normal
  for w, n in predicted_phd:
    S = R + n.var() # innovation covariance
    K = n.var() / S # Kalman gain
    error = o.mean() - n.mean() # Innovation

    mu = n.mean() + K * error
    var = (1 - K) * n.var()
    # Numerator of the weight update; norm.pdf expects a standard
    # deviation, hence sqrt(S)
    weight = Pd * w * norm.pdf(o.mean(), n.mean(), sqrt(S))
    w_sum += weight
    tmp_gm.append((weight, norm(mu, sqrt(var))))

  # Division for weights
  for w, n in tmp_gm:
    weight = w / (c_z + w_sum)
    kalman_updated_gm.append((weight, n))

Figure 33: Kalman update using observation

Third update

Take into account potentially missed objects

Multiply the predicted PHD by \(1- P_D\) to add the possibility of missed objects:

Figure 34: Add components to represent the miss-detection hypotheses

Third update

Add previous PHDs

Figure 35: Updated PHD

Info:

Number of components: 69; sum of the weights: 2.058, i.e., about two expected objects.

Approximation using Gaussian Mixtures

We model the PHD and the measurements as Gaussian mixtures, i.e., weighted sums of Gaussian distributions.
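Concretely, the PHD is carried as a weighted sum of Gaussians, which is exactly the list of (weight, normal) pairs manipulated in the code above:

\[ D(x) = \sum_{i=1}^{J} w^{(i)} \mathcal{N}(x; m^{(i)}, P^{(i)}) \]

Since the integral of the PHD is the expected number of objects, the sum of the weights estimates how many objects are present, which is why the sum reported above is close to 2.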


Limits of the filter

  • increasing number of components (\(\times (m_k + 1)\)) at each update => need to prune or merge components (see the sketch after this list);
  • not very robust to missed detections (no “memory” of the number of objects);
  • approximates the targets as independent and identically distributed.
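A minimal pruning/capping sketch over the (weight, component) pairs used throughout (the threshold and cap are illustrative values):

Code
def prune(gm, w_min=1e-3, max_components=50):
  # Drop low-weight components, then keep only the heaviest ones
  kept = [(w, n) for w, n in gm if w >= w_min]
  kept.sort(key=lambda wn: wn[0], reverse=True)
  return kept[:max_components]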

Other (better) filters exist, like the CPHD, the MB-RFS filter, the PMBM filter…