Lecture 4 - Part 2


Introduction

In this chapter, we will first explore an interactive practical demo to see the applicability of what we have learned so far. We will then zoom out a bit and understand the practical workflow involved in implementing detection on IoT devices. Lastly, we will cover some advanced aspects of IoT where we are no longer concerned with point observations and instead want to capture a spatio-temporal phenomenon using a network of sensors.

Practical Aspects

To demonstrate how the theoretical concepts you have studied so far connect to practice, let us consider a simple setup. We have an ESP32 micro-controller with a Grove breakout kit. Grove is a connector system derived from JST-style connectors. The four-pin connectors provide an easy way of connecting sensor boards to the kit. We will be connecting a BME680 sensor to the ESP32. The BME680 is an integrated sensor from Bosch designed for IoT and wearable applications, with small size and low power consumption in mind. It provides environmental sensing of temperature, humidity, pressure, and gas; the gas measurement is an approximate indicator of VOCs. Details can be found here.
Demo:

To test this demo, you need the board connected to the PC, and you need the right firmware on the board; the firmware tab has all the firmware details. Once you have main.py and the BME680 drivers on the board, you will be able to use the I2C protocol to read sensor observations. A simple demo would involve detecting when the sensor is hot versus when it is at normal room temperature. To collect the data, first make the connection and then press "Start". Once you have enough samples for "H₀", switch to "H₁" in the drop-down and simultaneously either bring a heat source close to the sensor or spray some perfume. Once you are happy with the number of samples under the alternative hypothesis, press "Stop" and run the Python script. You will then see the histogram. Using the histogram you can fit a theoretical PDF and work out the NP criterion. The Python editor is only enabled once you have data, and you can see the readings in the console tab.
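The PDF-fitting and threshold step above can be sketched as follows. This is a minimal sketch, not the course script: the `h0` and `h1` arrays are simulated stand-ins for the readings you would export from the console tab, and the nominal temperatures are assumed values.

```python
# Sketch: fit Gaussian PDFs to logged H0/H1 samples and derive an NP threshold.
# The sample arrays below are simulated stand-ins for real logged readings.
import random
from statistics import NormalDist, mean, stdev

random.seed(0)
h0 = [random.gauss(22.0, 0.5) for _ in range(500)]   # room temperature (H0)
h1 = [random.gauss(27.0, 1.0) for _ in range(500)]   # heated sensor (H1)

# Fit a Gaussian to each hypothesis from the sample mean and standard deviation.
f0 = NormalDist(mean(h0), stdev(h0))
f1 = NormalDist(mean(h1), stdev(h1))

# For a Gaussian mean-shift problem, the NP test reduces to comparing the
# reading against a threshold set by the desired false alarm probability.
pfa = 0.01
gamma = f0.inv_cdf(1 - pfa)          # P(x > gamma | H0) = pfa
pd = 1 - f1.cdf(gamma)               # resulting detection probability

print(f"threshold = {gamma:.2f} C, Pd = {pd:.3f}")
```

With real data you would replace the simulated arrays with the logged samples; the rest of the calculation is unchanged.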

Zooming Out

Having looked at the demo, let us now zoom out slightly and understand the workflow for implementing detection that is optimal in the sense of maintaining a constant false alarm rate (CFAR). The flow chart below shows the steps involved. As you will observe, there are three stages: data collection, determination of the optimal threshold, and lastly the use of this threshold for inference.

Flow chart (summary):

  1. Sample collection: start the ESP32, initialize the ADC and sensors, select hypothesis H₀ or H₁, collect the samples, and store or transmit the data.
  2. NP criterion / threshold determination: receive the data on the PC, separate the H₀ and H₁ samples, estimate statistics (PDF, mean, etc.), calculate the threshold for the desired Pfa, and upload the threshold to the ESP32.
  3. Threshold-based detection: the ESP32 measures new data and compares each sample to the threshold; if sample ≥ threshold, decide H₁, otherwise decide H₀, then output the decision or trigger an action.
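The final inference stage can be sketched as below. This is a hedged sketch of the on-device loop, runnable on a PC: `read_temperature()` is a hypothetical placeholder for the real BME680 driver call, stubbed here with simulated readings, and the threshold value is an assumed output of the calibration stage.

```python
# Sketch of the stage-3 inference loop as it could run on the ESP32.
# read_temperature() is a placeholder for the real sensor read over I2C;
# here it is stubbed with simulated H0 readings so the logic runs anywhere.
import random

random.seed(1)
THRESHOLD = 23.2  # assumed value uploaded from the PC after NP/CFAR calibration

def read_temperature():
    # Stand-in for the real BME680 driver call.
    return random.gauss(22.0, 0.5)

def decide(sample, threshold=THRESHOLD):
    # Threshold test: decide H1 if the sample meets or exceeds the threshold.
    return "H1" if sample >= threshold else "H0"

decisions = [decide(read_temperature()) for _ in range(100)]
false_alarms = decisions.count("H1")
print(f"{false_alarms} alarms in 100 H0 samples")
```

Because the readings are drawn under H₀, the few "H₁" outputs observed are exactly the false alarms whose rate the threshold was designed to control.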

The same workflow also applies when we have a regression problem on the sensor observations rather than a detection problem.

Stage 1: Data Collection → Stage 2: Regression → Stage 3: Prediction

Regression is one of the methods for time-series analysis, and you will study it in some detail in your ML courses. We will also cover some aspects of time-series processing in the next few chapters.
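The three regression stages above can be illustrated with a minimal sketch: collect a short temperature series, fit a linear trend by ordinary least squares, and predict the next observation. The data here are simulated (a slowly warming sensor is an assumption for illustration).

```python
# Minimal regression sketch: fit a linear trend y = a + b*t to a short
# temperature series and predict the next observation.
n = 20
t = list(range(n))
temps = [21.0 + 0.1 * k for k in t]  # simulated slowly warming sensor

# Ordinary least squares from the closed-form sums.
tm = sum(t) / n
ym = sum(temps) / n
b = sum((ti - tm) * (yi - ym) for ti, yi in zip(t, temps)) / sum((ti - tm) ** 2 for ti in t)
a = ym - b * tm

prediction = a + b * n  # one-step-ahead prediction
print(f"slope={b:.3f}, next={prediction:.2f}")
```

On the noiseless series the fit recovers the slope exactly; with real sensor data the same formulas give the least-squares trend.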

Networked Detection Perspective

1. Introduction

In distributed detection, multiple sensors are deployed across a field to observe a physical phenomenon, such as temperature, moisture, or gas concentration. The goal is to combine these observations to make a reliable global decision.

Field Phenomenon Observation Diagram:

[Diagram: Sensors 1, 2, and 3 observe the field/environment and report to a fusion center, which outputs the global decision H.]

Sensors measure local signals which may be noisy, and send either raw measurements or local decisions to a central fusion center.


2. Problem Formulation

Binary hypothesis testing:

H_0: \text{Event absent}, \quad H_1: \text{Event present}

Each sensor i observes x_i and reports either:

  • Raw observation x_i → Data Fusion
  • Local decision u_i ∈ {0, 1} → Decision Fusion

The fusion center makes a global decision H.


3. Data Fusion (Fusion of Raw Observations)

Data Fusion refers to the process where each sensor sends its raw measurement directly to a central fusion center. The fusion center then processes all the sensor observations collectively to make a global decision. This method is considered optimal because it retains all available information from the sensors, allowing the fusion center to compute the likelihoods of different hypotheses based on the full set of data.

Advantages of data fusion:

  1. Optimal performance: Uses all available information.
  2. Noise weighting: Sensors with lower noise can be given higher weights.
  3. Flexibility: Supports likelihood-based methods, Bayesian fusion, or weighted averaging.

Disadvantages:

  1. High bandwidth requirement: Raw sensor data must be transmitted.
  2. Centralized processing: Fusion center must handle all computations.

In essence, data fusion leverages the complete statistical information from all sensors to improve the reliability of the global decision, especially when sensor noise characteristics are known.

3.1 Ideal Scenario: Data Fusion

In the ideal scenario, all sensors provide perfect, noise-free measurements. Each sensor sends its raw observation to the fusion center, which can then perform exact statistical processing without any uncertainty due to sensor noise.

  • Perfect measurements: Sensor readings exactly reflect the underlying signal.
  • Optimal likelihood ratio test (LRT):
\begin{equation} \Lambda(x_1, x_2, \ldots, x_N) = \frac{p(x_1, x_2, \ldots, x_N \mid H_1)}{p(x_1, x_2, \ldots, x_N \mid H_0)} \gtrless \eta \end{equation}
  • Outcome: Fusion center can make error-free or near-perfect global decisions.
  • Bandwidth requirement: All raw data must be transmitted.
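For independent Gaussian sensors the joint LRT factorizes over sensors, and it is convenient to work in the log domain. The sketch below assumes each x_i ~ N(0, σ²) under H₀ and N(μ, σ²) under H₁; the parameter values are illustrative assumptions.

```python
# Sketch of the centralized LRT for N independent Gaussian sensors:
# under H0 each x_i ~ N(0, s2), under H1 each x_i ~ N(mu, s2).
# The joint likelihood ratio factorizes, so the log-LRT is a sum over sensors.
import math

def log_lrt(x, mu=1.0, s2=1.0):
    # log p(x|H1) - log p(x|H0); each term simplifies to (mu*x_i - mu^2/2)/s2
    return sum((mu * xi - mu**2 / 2) / s2 for xi in x)

eta = 1.0                        # LRT threshold
x = [1.2, 0.9, 1.4]              # observations from three sensors
decision = "H1" if log_lrt(x) > math.log(eta) else "H0"
print(decision)
```

Expanding the Gaussian densities shows each per-sensor log-ratio is (μ·x_i − μ²/2)/σ², which is why the test reduces to comparing a weighted sum of the raw observations against a threshold.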

[Diagram: Sensors 1–3 send ideal observations x1, x2, x3 to the fusion center, which computes the LRT and outputs the global decision H.]

3.2 Noisy Scenario: Data Fusion

In real-world scenarios, sensor measurements are often corrupted by noise. Each sensor observes:

\begin{equation} x_i = s + n_i, \quad n_i \sim N(0, \sigma_i^2) \end{equation}
  • Effect of noise: Raw sensor readings are no longer perfect, reducing the discriminative power of the global likelihood ratio.
  • Weighted fusion: Fusion center can assign weights based on noise variance:
\begin{equation} \Lambda = \sum_{i=1}^{N} w_i x_i, \qquad w_i \propto \frac{1}{\sigma_i^2} \end{equation}
  • Outcome: Data fusion remains optimal in a statistical sense, but accuracy depends on noise levels and correct weighting.
  • Bandwidth requirement: Still requires all raw data to be transmitted.
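The inverse-variance weighting above can be sketched directly; the observation and variance values below are illustrative assumptions.

```python
# Sketch of inverse-variance weighted data fusion: sensors with smaller noise
# variance receive larger weights, and the fused statistic would then be
# compared against a threshold.
def fuse(observations, variances):
    # w_i proportional to 1/sigma_i^2, normalized to sum to one
    inv = [1.0 / v for v in variances]
    total = sum(inv)
    weights = [w / total for w in inv]
    return sum(w * x for w, x in zip(weights, observations))

x = [1.1, 0.7, 1.6]           # noisy readings of the same underlying signal
var = [0.25, 1.0, 4.0]        # sensor 1 is the most reliable
stat = fuse(x, var)
print(f"fused statistic = {stat:.3f}")
```

Note how the fused value stays close to the reading of the low-variance sensor, which carries most of the weight.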

[Diagram: Sensors 1–3 send noisy observations x1, x2, x3 to the fusion center, which computes a weighted LRT and outputs the global decision H.]


4. Decision Fusion (Fusion of Local Decisions)

Decision fusion refers to the process where each sensor makes a local decision (e.g., a 1-bit binary decision) based on its own observation and sends only this decision to the fusion center. The fusion center then combines these local decisions using rules such as majority voting, OR, or AND to reach a global decision.

Advantages:

  1. Low bandwidth: Only 1-bit decisions per sensor need to be transmitted.
  2. Simplicity: Reduces computational load at the fusion center.

Disadvantages:

  1. Suboptimal performance: Some information is lost because the raw measurements are not transmitted.
  2. Sensitivity to local errors: If local sensors are noisy, their decisions can degrade global performance.

In practice, decision fusion is often preferred in resource-constrained networks where bandwidth and energy consumption are critical, but it is generally less accurate than data fusion because it cannot exploit the full statistical information from the sensors.

4.1 Ideal Observations

[Diagram: Sensors 1–3 send local decisions u1, u2, u3 to the fusion center, which applies a majority, OR, or AND rule and outputs the global decision H.]

In the ideal scenario, each sensor makes a perfect local decision:

  • Local detection probability: P_{D,i} = 1
  • Local false alarm probability: P_{F,i} = 0

OR Rule

  • Global misdetection probability:
\begin{equation} P_{M}^{OR} = P(\text{all sensors miss}) = \prod_{i=1}^{N} (1 - P_{D,i}) \end{equation}
  • Description: Fusion center declares the event present if any sensor detects it; this minimizes misdetection. Equivalently, the miss probability is the complement of the union of the detection events:
P_{M}^{OR} = 1 - P\Big(\bigcup_{i=1}^{N} \text{Sensor } i \text{ detects}\Big)

AND Rule

  • Global misdetection probability:
\begin{equation} P_{M}^{AND} = 1 - \prod_{i=1}^{N} P_{D,i} \end{equation}
  • Description: Fusion center declares event present only if all sensors detect it; conservative approach.

Majority Voting Rule

  • Global misdetection probability:
\begin{equation} P_{M}^{Majority} = \sum_{k=0}^{\lfloor N/2 \rfloor} \binom{N}{k} P_D^k (1-P_D)^{N-k} \end{equation}
  • Description: Event is declared present if more than half the sensors detect it; balanced rule between OR and AND.
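The three misdetection formulas can be evaluated side by side for N identical sensors; the per-sensor detection probability used below is an assumed example value.

```python
# Sketch evaluating the OR, AND, and majority misdetection formulas for
# N identical sensors with per-sensor detection probability pd.
from math import comb, floor

def pm_or(pd, n):
    # OR rule misses only if every sensor misses
    return (1 - pd) ** n

def pm_and(pd, n):
    # AND rule misses unless every sensor detects
    return 1 - pd ** n

def pm_majority(pd, n):
    # majority rule misses if at most floor(n/2) sensors detect
    return sum(comb(n, k) * pd**k * (1 - pd) ** (n - k)
               for k in range(floor(n / 2) + 1))

pd, n = 0.9, 3
print(pm_or(pd, n), pm_majority(pd, n), pm_and(pd, n))
```

For pd = 0.9 and three sensors this gives 0.001 (OR), 0.028 (majority), and 0.271 (AND), matching the intuition that OR is the most aggressive rule and AND the most conservative.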

4.2 Noisy Observations

Noise increases local errors (P_{D,i} decreases, P_{F,i} increases).

[Diagram: Sensors 1–3 send noisy local decisions u1, u2, u3 to the fusion center, which applies a majority, OR, or AND rule and outputs the global decision H.]


5. Comparison Table

Feature          | Data Fusion                         | Decision Fusion
Data transmitted | Raw observations                    | Local 1-bit decisions
Performance      | Optimal (ideal)                     | Suboptimal
Noise handling   | Weighted fusion improves robustness | Only mitigated via fusion rule or soft decisions
Bandwidth        | High                                | Low
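The performance gap in the table can be seen in a small Monte Carlo sketch: the same Gaussian sensors are fused either by summing raw observations or by majority-voting 1-bit local decisions. All parameter values (mean shift, thresholds, trial count) are illustrative assumptions.

```python
# Monte Carlo sketch contrasting data fusion (sum of raw observations) with
# decision fusion (majority vote over 1-bit local decisions) for three
# Gaussian sensors: x_i ~ N(0,1) under H0 and N(MU,1) under H1.
import random

random.seed(42)
N, TRIALS, MU = 3, 2000, 1.0

def trial(h1):
    x = [random.gauss(MU if h1 else 0.0, 1.0) for _ in range(N)]
    data_det = sum(x) > N * MU / 2            # fused-sum test at the midpoint
    votes = sum(xi > MU / 2 for xi in x)      # local 1-bit decisions
    dec_det = votes >= 2                      # majority vote
    return data_det, dec_det

data_pd = sum(trial(True)[0] for _ in range(TRIALS)) / TRIALS
dec_pd = sum(trial(True)[1] for _ in range(TRIALS)) / TRIALS
print(f"data fusion Pd = {data_pd:.3f}, decision fusion Pd = {dec_pd:.3f}")
```

With matched midpoint thresholds, the raw-observation fusion typically achieves a higher detection probability than the 1-bit majority vote, reflecting the information lost in quantizing each sensor to a single bit.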

6. Summary

  • Distributed detection uses multiple sensors to improve reliability.
  • Data fusion is optimal but bandwidth-intensive.
  • Decision fusion is simpler and low-bandwidth but suboptimal.
  • Noise reduces accuracy; data fusion can partially compensate via weighting, decision fusion suffers more from local errors.
