Finite Set Statistics for Multiple-Object Tracking

Single Object Tracking

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

Recall: Bayes Rule

\[ P(A \vert B) = \frac{ P(A) P(B \vert A) }{ P(B) } \qquad(1)\]

\[ P(A \vert B) = \frac{ P(A) P(B \vert A) }{ \sum_A P(A) P(B \vert A) } \]

Using PDFs:

\[ p(x \vert y) = \frac{ p(x) p(y \vert x) }{ \int p(x) p(y \vert x)dx } \]

State estimation using Bayes Rule

If we have a prior on a state (e.g., for localization) \(p(x_k \vert z^{k-1})\), we can take observations \(z_k\) into account using the likelihood function:

\[ p(x_k \vert z^k) = \frac{ p(x_k \vert z^{k-1}) g(z_k \vert x_k) }{ \int p(x_k \vert z^{k-1}) g(z_k \vert x_k)dx_k } \]

We can compute the prior at time \(k\) from time \(k-1\) using the Chapman-Kolmogorov equation:

\[ p(x_k \vert z^k) = \frac{ \int f(x_k \vert x_{k-1})p(x_{k-1} \vert z^{k-1})g(z_k \vert x_k) dx_{k-1} }{ \iint f(x_k \vert x_{k-1})p(x_{k-1} \vert z^{k-1})g(z_k \vert x_k) dx_{k-1}dx_k } \]

A particular case - the Kalman Filter

Assuming linear evolution and observation models and zero-mean Gaussian noises, we have

\[ p(x_{k} \vert z^{k-1}) = \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \]

and

\[ g(z_k \vert x_k) = \mathcal{N}(z_k; H_k x_k, R_k) \]

The posterior density obtained by the Kalman update is the solution of Bayes rule under the previous assumptions:

\[ p(x_k \vert z^k) = \frac{ \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \mathcal{N}(z_k; H_k x_k, R_k) }{ \int \mathcal{N}(x_{k}; \mu_{k \vert k-1}, P_{k \vert k-1}) \mathcal{N}(z_k; H_k x_k, R_k)dx_k } \]
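For reference, this posterior is again Gaussian, \(p(x_k \vert z^k) = \mathcal{N}(x_k; \mu_{k \vert k}, P_{k \vert k})\), with the familiar closed-form update:

\[ \begin{aligned} S_k & = H_k P_{k \vert k-1} H_k^\top + R_k \\ K_k & = P_{k \vert k-1} H_k^\top S_k^{-1} \\ \mu_{k \vert k} & = \mu_{k \vert k-1} + K_k (z_k - H_k \mu_{k \vert k-1}) \\ P_{k \vert k} & = (I - K_k H_k) P_{k \vert k-1} \end{aligned} \]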

Object Tracking using KF

Scenario

Toy example: a single target moving in 1-D, with an associated constant-velocity model:

\[ X_k = \begin{pmatrix} x_k \\ v_k \end{pmatrix},\ X_{k+1} = \begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix} X_k \]

We only observe \(x\), so the observation model is:

\[ z_k = \begin{pmatrix} 1 & 0 \end{pmatrix} X_k \]

Assume process noise \(Q = \begin{pmatrix}0.1 & 0 \\ 0 & 0.2\end{pmatrix}\) and observation noise \(R = 0.3\).
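A minimal numpy sketch of one predict/update cycle for this model (the initial state, initial covariance, and measurement values are assumptions for illustration):

Code
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity evolution
H = np.array([[1.0, 0.0]])              # we only observe the position x
Q = np.array([[0.1, 0.0], [0.0, 0.2]])  # process noise
R = np.array([[0.3]])                   # observation noise

x = np.zeros(2)  # assumed initial state
P = np.eye(2)    # assumed initial covariance

def kf_step(x, P, z):
  # Prediction
  x = F @ x
  P = F @ P @ F.T + Q
  # Update
  S = H @ P @ H.T + R             # innovation covariance
  K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
  x = x + K @ (z - H @ x)
  P = (np.eye(2) - K @ H) @ P
  return x, P

for z in [np.array([0.9]), np.array([2.1])]:  # made-up measurements
  x, P = kf_step(x, P, z)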

Object Tracking using KF

Prior

Figure 1: Prior density on \(X\)

Object Tracking using KF

Prediction 1

Figure 2: Prior and prediction of \(X\)

Object Tracking using KF

Observation 1

Figure 3: Prediction of \(X\) and first observation \(z\)

Object Tracking using KF

Update 1

Figure 4: Kalman update using prediction and observation

Object Tracking using KF

Prediction 2

Figure 5: Update result and prediction of next timestep

Object Tracking using KF

Observation 2

Figure 6: Prediction and second observation

Object Tracking using KF

Update 2

Figure 7: Kalman update using prediction and observation

“Simple” Multiple Object Tracking

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

The Problem

How to track a known and constant number of targets?

Figure 8: A problem of multiple object tracking

Conceptual solution: use one KF per track and associate measurements to tracks (nearest neighbor, global nearest neighbor, …), as sketched below.
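As a sketch, global nearest neighbor (GNN) association can be computed with the Hungarian algorithm on a distance matrix (the predicted positions and measurements below are made-up 1-D values):

Code
import numpy as np
from scipy.optimize import linear_sum_assignment

preds = np.array([0.0, 5.0])  # predicted track positions (assumed)
meas = np.array([5.2, -0.1])  # incoming measurements (assumed)

# Cost matrix of squared distances between every track and measurement
cost = (preds[:, None] - meas[None, :]) ** 2
# Optimal one-to-one assignment: track rows[i] gets measurement cols[i]
rows, cols = linear_sum_assignment(cost)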

Prediction

Figure 9: Independent prediction for each target

Observations and association

Figure 10: Predictions and observations

Update

Figure 11: Update step

Limits

How to deal with

  • an unknown / time-varying number of tracks;
  • missed detections;
  • clutter measurements (false positives);
  • the confidence in the existence of the targets?

Random Finite Sets

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

What is a Finite Set?

A set is a collection of different things, called its elements. Sets are unordered.

\(X = \{\} = \emptyset\) is the empty set.

\(X = \{x\}\) is a singleton.

A set may contain a finite or an infinite number of elements.

A Finite Set is a set containing a finite number of elements.

The number of elements is the cardinality of the set, denoted \(\lvert X \rvert\).

Figure 12: Example of a set of polygons

Random Finite Set

Definition

A Random Finite Set (RFS) is a set containing a random number of random variables, i.e.,

\[ X = \left\{ x_1, \dots, x_n \right\} \]

where both the cardinality \(n\) and the \(x_i\) are randomly distributed.

Elements of a RFS

The \(x_i\) may be random vectors of any dimension, for example the position of a target \(\begin{pmatrix}x \\ y \end{pmatrix}\) or its pose \(\begin{pmatrix}x \\ y \\ \theta \end{pmatrix}\).

Set Integral, PDF

Multi-object PDF \(\pi(X) \geq 0\), s.t.

\[ \int \pi(X)dX = 1 \]

based on the set integral:

\[ \int \pi(X)dX = \sum_{i=0}^{\infty}\rho(i) \]

With \(\rho(i)\) the cardinality distribution of \(X\):

\[ \rho(i) \triangleq \frac{1}{i!}\int\pi(\left\{ x_1, \dots, x_i \right\})dx_1 \dots dx_i \]

Common Random Finite Sets

Bernoulli RFS

\[ \pi(X) = \begin{cases} 1 - r & \text{if } X = \emptyset \\ r \cdot p(x) & \text{if } X = \left\{ x \right\} \\ 0 & \text{otherwise} \end{cases} \]

  • Defined by \((r, p)\)
  • Represents an element with existence probability \(r\) and distribution \(p\)
  • Cardinality is either 0 or 1.
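As a sanity check, the set integral of this density is 1, matching the cardinality distribution \(\rho(0) = 1 - r\), \(\rho(1) = r\):

\[ \int \pi(X)dX = (1 - r) + r\int p(x)dx = (1 - r) + r = 1 \]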

Common Random Finite Sets

Multi-Bernoulli (MB)-RFS

  • Union of \(M\) independent Bernoulli RFS
  • Defined by \(\left\{ \left( r^{(i)}, p^{(i)} \right) \right\}_{i=1}^M\)
  • \(0 \leq \lvert X \rvert \leq M\)

Common Random Finite Sets

Poisson RFS

Also called Poisson Point Process

  • \(\lvert X \rvert \sim \mathcal{P}(\bar\lambda)\)
  • \(x_i\) i.i.d. \(\sim p(\cdot)\)
  • \(\pi(\left\{ x_1, \dots, x_n \right\}) = e^{-\bar\lambda} \bar\lambda^n p(x_1)\dots p(x_n)\)

Commonly used to model clutter measurements (false positives).

Intensity Function of a Poisson RFS

The intensity function of a Poisson RFS is defined as

\[ \lambda(x) = \bar\lambda \cdot p(x) \]

We can recover \(\bar\lambda\) and \(p(\cdot)\) from it:

\[ \bar\lambda = \int\lambda(x)dx \]

\[ p(x) = \frac{\lambda(x)}{\bar\lambda} = \frac{\lambda(x)}{\int\lambda(x)dx} \]

Realizations of a Poisson RFS

Example:

  • \(\lvert X \rvert \sim \mathcal{P}(4)\)
  • \(x_i\) i.i.d. \(\sim U(-2, 2) \times U(-2, 2)\)

Figure 13: Realizations of a Poisson RFS
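Such realizations can be drawn in a few lines of numpy (a sketch; the seed is arbitrary):

Code
import numpy as np

rng = np.random.default_rng(0)
n = rng.poisson(4)                        # draw the cardinality ~ P(4)
points = rng.uniform(-2, 2, size=(n, 2))  # n i.i.d. points ~ U(-2,2)^2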

But, why Random Finite Sets?

  • Because they express both the uncertainty about the number of targets and the targets themselves.
  • Because we can use the Bayes rule with them:

\[ p(X_k \vert Z^k) = \frac{ p(X_k \vert Z^{k-1}) g(Z_k \vert X_k) }{ \int p(X_k \vert Z^{k-1}) g(Z_k \vert X_k) dX_k } \]

\(g(Z_k \vert X_k)\) is the multi-object likelihood.

Multiple object tracking using RFS

Let’s use a more complicated model:

  • unknown and time varying number of targets;
  • detection probability \(P_D(x) \leq 1\);
  • clutter modeled as a PPP with intensity function \(\lambda_c(\cdot)\);
  • the measurements of one target are independent of the other measurements and targets;
  • each measurement is the result of at most one object.

Use RFS to represent the states (\(X_k\)) and the measurements (\(Z_k\)).

Motion Model

=> Model both the evolution of existing objects and the birth of new ones, as written below.
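One standard way to write this (following the usual FISST motion model; the notation is introduced here): each object \(x \in X_{k-1}\) survives as a Bernoulli RFS \(S_{k \vert k-1}(x)\) with survival probability \(P_S\), and a birth RFS \(B_k\) is added:

\[ X_k = \left[ \bigcup_{x \in X_{k-1}} S_{k \vert k-1}(x) \right] \cup B_k \]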

Measurement Model

\(Z_k = \left\{ z_k^1, \dots, z_k^{m_k} \right\}\) and \(X_k = \left\{ x_k^1, \dots, x_k^{n_k} \right\}\)

Observations \(Z_k = C_k \cup O_k\), the union of clutter measurements \(C_k\) and object-generated measurements \(O_k\).

Using the association hypothesis vector \(\theta_k = \left[ \theta_k^1, \dots, \theta_k^{n_k} \right]\) where

\[ \theta_k^i = \begin{cases} j & \text{if object } i \text{ is associated to measurement } j \\ 0 & \text{if object } i \text{ is undetected} \end{cases} \]

The observation model is

\[ p(Z_k \vert X_k) = \sum_{\theta_k} \left( \exp(-\bar\lambda_c) \prod_{j=1}^{m_k}\lambda_c(z_k^j)\prod_{i:\theta_k^i=0}(1 - P_D(x_k^i)) \\ \times \prod_{i: \theta_k^i \gt 0}\frac{ P_D(x_k^i)g_k(z_k^{\theta_k^i} \vert x_k^i) }{ \lambda_c(z_k^{\theta_k^i}) } \right) \]

Measurement Model - Other Form

\[ p(Z_k \vert X_k) = p_{c_k}(Z_k)g_k(\emptyset \vert X_k)\sum_{\theta_k}\prod_{i:\theta_k^i \gt 0}\frac{ P_D(x_k^i)g_k(z_k^{\theta_k^i} \vert x_k^i) }{ \lambda_c(z_k^{\theta_k^i})(1 - P_D(x_k^i)) } \]

with

\[ p_{c_k}(Z_k) = \exp(-\bar\lambda_c)\prod_{s=1}^{m_k}\lambda_c(z_k^s) \]

\[ g_k(\emptyset \vert X_k) = \prod_{j=1}^{n_k}(1 - P_D(x_k^j)) \]

This second form follows from the first by multiplying and dividing each detected-object factor by \(1 - P_D(x_k^i)\), which lets the common factor \(g_k(\emptyset \vert X_k)\) be pulled out of the sum.

Tractability

The exact equations above are intractable in practice => we need to introduce approximations.

The Probability Hypothesis Density Filter

  • Single Object Tracking

  • “Simple” Multiple Object Tracking

  • Random Finite Sets

  • The Probability Hypothesis Density Filter

The PHD function

The Probability Hypothesis Density \(D_X(x)\) of a RFS \(X\) is defined as:

\[ \begin{aligned} D_X(x) & = \int p_X(X)\sum_{x' \in X}\delta(x-x')dX \\ & = \int p_X(\left\{ x \right\} \cup X)dX \end{aligned} \]

It is the first-order moment of the RFS, also called intensity function of the RFS.

\[ \int_A D_X(x)dx = \mathbb{E}\left[\left| X \cap A \right|\right] \]

=> Expected number of elements in the subspace \(A\).

PHD of a Bernoulli RFS

If \(X\) is a Bernoulli RFS parametrized by \(r\) and \(p(\cdot)\), its PHD is

\[ D_X(x) = r \cdot p(x) \]

Example with \(r = 0.75\) and \(p(x) = \mathcal{N}(x; 1.2, 0.78)\)

Figure 14: PHD of a Bernoulli RFS

PHD of a Multi-Bernoulli RFS

If \(X\) is a Multi-Bernoulli RFS parametrized by \(\left\{ r^{(i)}, p^{(i)}(\cdot) \right\}_{i=1}^{N}\), its PHD is

\[ D_X(x) = \sum_{i=1}^{N} r^{(i)}p^{(i)}(x) \]

Example 1

MB-RFS with 2 components \((1, \mathcal{N}(-2.3, 1.4))\) and \((1, \mathcal{N}(1.2, 0.85))\)

Figure 15: PHD of a Multi-Bernoulli RFS - eg. 1

Example 2

MB-RFS with 2 components \((0.95, \mathcal{N}(-2.3, 1.4))\) and \((0.65, \mathcal{N}(1.2, 0.85))\)

Figure 16: PHD of a Multi-Bernoulli RFS - eg. 2

PHD of a Poisson RFS

If \(X\) is a Poisson RFS with intensity function \(\lambda(\cdot)\), its PHD is

\[ D_X(x) = \lambda(x) \]

Approximation of a RFS using the PHD

Since a Poisson RFS is fully characterized by its PHD, any RFS may be approximated by the Poisson RFS parametrized by the same PHD.

Example

Let \(X\) be an MB-RFS with 2 components \((0.95, \mathcal{N}(-2.3, 1.4))\) and \((0.65, \mathcal{N}(1.2, 0.85))\).

\[ D_X(x) = 0.95\, \mathcal{N}(x; -2.3, 1.4) + 0.65\, \mathcal{N}(x; 1.2, 0.85) \]

\(X\) can be approximated by a PPP with intensity function \(\lambda(x) = D_X(x)\)
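Note that the expected number of objects is preserved by this approximation: \(\int D_X(x)dx = 0.95 + 0.65 = 1.6\).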

PHD Filter - Toy Example

  • 2 features located at \(-1.0\) and \(1.0\) along the x axis.
  • The y axis is such that \(y(x) = PHD(x)\).

The prior is composed of two Gaussians \(\mathcal{N}(-1.02, 0.3^2)\) and \(\mathcal{N}(1.1, 0.4^2)\) with respective weights \(0.85\) and \(0.92\).

Figure 17: Prior PHD of the filter

First update

Observations

Suppose we observe one object at \(0.99\), but we miss the object at \(-1\). The observation noise is set to \(R = 0.15\).

We can plot the observation above the prior:

Figure 18: Prior PHD and observation

First update

Kalman update

Perform a Kalman update for each component of the prior, using the observation:

Code
# There is only one observation => no loop over the observation set.
# Assumed context: prior_gm is a list of (weight, scipy.stats.norm)
# pairs, o_1 is the observation (a frozen norm), R the observation
# noise variance, Pd the detection probability, c_z the clutter
# intensity at the observation.
from math import sqrt
from scipy.stats import norm

w_sum = 0
tmp_gm = []
kalman_updated_gm = []

# decompose each component as weight + normal
for w, n in prior_gm:
  S = R + n.var() # innovation covariance
  K = n.var() / S # Kalman gain
  error = o_1.mean() - n.mean() # innovation

  mu = n.mean() + K * error
  var = (1 - K) * n.var()
  # Numerator of the weight update; norm.pdf expects a standard
  # deviation, hence sqrt(S)
  weight = Pd * w * norm.pdf(o_1.mean(), n.mean(), sqrt(S))
  w_sum += weight
  tmp_gm.append((weight, norm(mu, sqrt(var))))

# Divide the weights by clutter intensity + sum of detection weights
for w, n in tmp_gm:
  weight = w / (c_z + w_sum)
  kalman_updated_gm.append((weight, n))

Figure 19: Kalman update using observation

First update

Take into account potentially missed objects

Multiply the predicted (prior) PHD by \(1- P_D\) to add the possibility of missed objects:
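In code, this step is a one-liner on the Gaussian-mixture form, reusing the names of the update code above (prior_gm, Pd):

Code
# Each prior component survives as a miss-detection hypothesis,
# with its weight scaled by 1 - Pd
missed_gm = [((1 - Pd) * w, n) for w, n in prior_gm]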

Figure 20: Add component to represent miss-detection hypotheses

First update

Add the previous PHDs to finish the update

Figure 21: Updated PHD

Prediction

Propagate previous update result

Scale by \(P_S\), the probability of survival:
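As a sketch (the features are static in this toy example, so only the weights change; updated_gm and Ps are assumed names for the previous update result and the survival probability):

Code
# Static dynamics: propagate each component by scaling its weight by Ps
propagated_gm = [(Ps * w, n) for w, n in updated_gm]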

Figure 22: Predicted PHD from previous step

Prediction

Birth model

Create a component for each observation from the previous step. The birth model follows a \(\mathcal{N}(\mu, R)\) with weight \(0.2\) (probably too high in practice, but fine for visualization); a sketch follows.
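A minimal sketch of this birth step, assuming obs holds the previous observations as frozen norm objects (as in the update code) and R is the observation noise variance:

Code
from math import sqrt
from scipy.stats import norm

# One birth component per previous observation: weight 0.2, mean at
# the observation, variance R
birth_gm = [(0.2, norm(o.mean(), sqrt(R))) for o in obs]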

Figure 23: Birth model

Prediction

The predicted PHD is the sum of the propagated PHD and the birth PHD.

Figure 24: Predicted PHD

Second update

Observations

Suppose this time we have the following observations (two true detections plus clutter):

Figure 25: Predicted PHD and observation

Second update

Kalman update

Update each Gaussian component with each observation.

Code
kalman_updated_gm = []

for o in obs:
  w_sum = 0
  tmp_gm = []

  # decompose component as weight + normal
  for w, n in predicted_phd:
    S = R + n.var() # innovation covariance
    K = n.var() / S # Kalman gain
    error = o.mean() - n.mean() # Innovation

    mu = n.mean() + K * error
    var = (1 - K) * n.var()
    # Numerator of the weight update; norm.pdf expects a standard
    # deviation, hence sqrt(S)
    weight = Pd * w * norm.pdf(o.mean(), n.mean(), sqrt(S))
    w_sum += weight
    tmp_gm.append((weight, norm(mu, sqrt(var))))

  # Division for weights
  for w, n in tmp_gm:
    weight = w / (c_z + w_sum)
    kalman_updated_gm.append((weight, n))

Figure 26: Kalman update using observation

Second update

Take into account potentially missed objects

Multiply the predicted PHD by \(1- P_D\) to add the possibility of missed objects:

Figure 27: Add components to represent miss-detected hypotheses

Second update

Add previous PHDs

Figure 28: Updated PHD

Prediction

Propagate previous update result

Scale by \(P_S\), the probability of survival:

Figure 29: Predicted PHD from previous step

Prediction

Birth model

Create a component for each observation from the previous step. The birth model follows a \(\mathcal{N}(\mu, R)\) with weight \(0.1\).

Figure 30: Birth model

Prediction

The predicted PHD is the sum of the propagated PHD and the birth PHD.

Figure 31: Predicted PHD

Third update

Observations

Suppose this time we have the following observations (two true detections, with larger noise on the second):

Figure 32: Predicted PHD and observation

Third update

Kalman update

Update each Gaussian component with each observation.

Code
kalman_updated_gm = []

for o in obs:
  w_sum = 0
  tmp_gm = []

  # decompose component as weight + normal
  for w, n in predicted_phd:
    S = R + n.var() # innovation covariance
    K = n.var() / S # Kalman gain
    error = o.mean() - n.mean() # Innovation

    mu = n.mean() + K * error
    var = (1 - K) * n.var()
    # Numerator of the weight update; norm.pdf expects a standard
    # deviation, hence sqrt(S)
    weight = Pd * w * norm.pdf(o.mean(), n.mean(), sqrt(S))
    w_sum += weight
    tmp_gm.append((weight, norm(mu, sqrt(var))))

  # Division for weights
  for w, n in tmp_gm:
    weight = w / (c_z + w_sum)
    kalman_updated_gm.append((weight, n))

Figure 33: Kalman update using observation

Third update

Take into account potentially missed objects

Multiply the predicted PHD by \(1- P_D\) to add the possibility of missed objects:

Figure 34: Add components to represent the miss-detection hypotheses

Third update

Add previous PHDs

Figure 35: Updated PHD

Info:

Number of components: 69; sum of the weights: 2.058, i.e., about two expected objects.

Approximation using Gaussian Mixtures

We model the PHD and the measurements as Gaussian mixtures, i.e., weighted sums of Gaussian distributions.
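Concretely, the PHD is carried as a weighted sum of Gaussians, which is exactly the list of (weight, normal) pairs manipulated in the code above:

\[ D(x) = \sum_{i=1}^{J} w^{(i)} \mathcal{N}(x; m^{(i)}, P^{(i)}) \]

Since the integral of the PHD is the expected number of objects, the sum of the weights estimates how many objects are present, which is why the sum reported above is close to 2.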


Limits of the filter

  • increasing number of components (\(\times (m_k + 1)\)) at each update => need to prune or merge components (see the sketch after this list);
  • not very robust to missed detections (no “memory” of the number of objects);
  • approximates the targets as independent and identically distributed.
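A minimal pruning/capping sketch over the (weight, component) pairs used throughout (the threshold and cap are illustrative values):

Code
def prune(gm, w_min=1e-3, max_components=50):
  # Drop low-weight components, then keep only the heaviest ones
  kept = [(w, n) for w, n in gm if w >= w_min]
  kept.sort(key=lambda wn: wn[0], reverse=True)
  return kept[:max_components]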

Other (better) filters exist, like the CPHD, the MB-RFS filter, the PMBM filter…