On-Device Deep Learning for IoT-based Wireless Sensing Applications

Manoj Kumar Lenka and Ayon Chakraborty @ SENSE Lab, IIT Madras

[Datasets] [Models] [Scripts] [Workshop Paper] [Artifact Paper] [Presentation Slides]

Citations

If you use the data, please cite us via the following references:
Workshop
    @inproceedings{lenka2024wisdom,
        title={On-Device Deep Learning for IoT-based Wireless Sensing Applications},
        author={Lenka, Manoj and Chakraborty, Ayon},
        booktitle={2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)},
        location={Biarritz, France},
        year={2024}
    }    
            
Artifact
    @inproceedings{lenka2024wisdomartifact,
        title={ARTIFACT: On-Device Deep Learning for IoT-based Wireless Sensing Applications},
        author={Lenka, Manoj and Chakraborty, Ayon},
        booktitle={2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)},
        location={Biarritz, France},
        year={2024}
    }    
            

Abstract

Recent innovations in Wi-Fi sensing capitalize on a host of powerful deep neural network architectures that make inferences based on minute spatio-temporal dynamics in the wireless channel. Since many such inferencing techniques are resource intensive, conventional wisdom recommends offloading them to the network edge for further processing. In this work, we argue that edge-based sensing is often not a viable option for many applications (due to cost, bandwidth, latency, etc.). Rather, we explore the paradigm of on-device Wi-Fi sensing, where inferencing is carried out locally on resource-constrained IoT platforms. We present extensive benchmark results characterizing the resource consumption (memory, energy) and the performance (accuracy, inferencing rate) of some typical sensing tasks. We propose WISDOM, a framework that, depending on the capabilities of the hardware platform and the application's requirements, can compress the inferencing model. Such context-aware compression aims to improve the overall utility of the system: maximal inferencing performance at minimal resource cost. We demonstrate that models obtained using the WISDOM framework achieve higher utility than baseline models that are merely quantized in 83% of cases, and higher utility than non-compressed models 99% of the time. (Video abstract TBA)

Wireless Multipath

What is wireless sensing?

When a signal is sent from the transmitter (Tx) to the receiver (Rx), the Rx receives several copies of the same signal due to reflection, refraction, scattering, etc., off the surrounding objects. Each copy arrives with a different time delay and amplitude; together they form the channel impulse response (CIR). We convert the time-domain CIR to the frequency domain using a Fourier transform to obtain the channel frequency response (CFR). Stacking the CFRs from multiple packets gives a channel state information (CSI) spectrogram, which can be used as a sensing modality. This CSI modality is then used to create a dataset for human activity recognition.
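
As a rough sketch of this pipeline (assuming per-packet CIRs are available as complex NumPy arrays; shapes and data below are illustrative, not from our datasets):

    import numpy as np

    def cir_to_cfr(cir: np.ndarray) -> np.ndarray:
        """Convert a time-domain CIR (complex taps) to the frequency-domain CFR."""
        return np.fft.fft(cir)

    def build_csi_spectrogram(cirs: list[np.ndarray]) -> np.ndarray:
        """Stack per-packet CFRs into a CSI spectrogram (packets x subcarriers)."""
        return np.vstack([cir_to_cfr(c) for c in cirs])

    # Example: 100 packets, 64 multipath taps each (synthetic data for illustration).
    rng = np.random.default_rng(0)
    cirs = [rng.normal(size=64) + 1j * rng.normal(size=64) for _ in range(100)]
    csi = build_csi_spectrogram(cirs)   # shape: (100, 64)
    amplitude = np.abs(csi)             # amplitude spectrogram used for sensing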

Edge Challenges

Why perform on-device wireless sensing?

In cloud/edge-based sensing, several devices are connected to a wireless access point (AP) via Wi-Fi. The AP is further connected to a cloud/edge server via the Internet. We send two kinds of data: user data and sensing data (which in our case is the CSI spectrogram). The sensing data consumes part of the bandwidth, which degrades the QoS of the other devices, i.e., increases ping latency, decreases download speed, etc. Further, sending data to the cloud or edge has its own set of challenges: it requires last-mile connectivity, increases latency, adds the OPEX cost of running services on remote servers, raises privacy concerns, etc. Hence, we would like to perform sensing on the device itself, which removes the need to send the CSI spectrograms any further. However, IoT devices usually have limited resources, which makes it difficult to perform sensing tasks using the deep learning based techniques commonly found in the literature.

On-Device Challenges

Challenges of performing on-device sensing?

Two straightforward ways to tackle the limited resources of IoT devices are to use better-provisioned devices or to use smaller (more compressed) deep learning models. Both have their problems. Better-provisioned devices consume more energy, which is a serious drawback since most IoT devices are battery powered. Smaller/compressed models, on the other hand, lose accuracy non-trivially. For instance, we trained a state-of-the-art Convolutional Neural Network (CNN) based model on CSI data for classifying human activities and evaluated it to be fairly robust and reasonably accurate (about 95%). However, we failed to deploy the model on as many as 75% (15 out of 20) of our representative test devices. With moderate to heavy compression, we were able to deploy the compressed models on about 50-75% of the test devices, but in such cases the classification accuracy takes a significant hit, dropping to only 50%.
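
As a concrete example of the kind of compression involved, below is a minimal sketch of post-training integer quantization with TensorFlow Lite (the toolchain used for deployment later on this page); the toy architecture and random calibration data are placeholders, not the models from the paper:

    import numpy as np
    import tensorflow as tf

    # A stand-in CNN over CSI spectrograms (100 packets x 64 subcarriers);
    # the architectures benchmarked in the paper differ.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(100, 64, 1)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(6, activation="softmax"),  # activity classes (assumed)
    ])

    def representative_data():
        # Calibration inputs for quantization; real CSI samples should be used here.
        for _ in range(10):
            yield [np.random.rand(1, 100, 64, 1).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data
    # Restrict to integer ops so the model can run on integer-only MCUs.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    with open("model_int8.tflite", "wb") as f:
        f.write(converter.convert())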

Our proposed solution

We design a framework that chooses a neural network and then compresses it for a given Wi-Fi sensing application, such that the user can tune the trade-off between performance and cost. To determine cost and performance we look at five metrics. For performance: the accuracy of the inferencing model and the rate at which inferences are generated. For cost: the runtime memory (RAM) needed, the flash consumed to store the sensing application, and the energy consumed per inference. We call our framework WISDOM.
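
For concreteness, these five metrics could be collected in a record like the following sketch (the field names and units are ours, not prescribed by the paper):

    from dataclasses import dataclass

    @dataclass
    class ModelMetrics:
        accuracy: float        # classification accuracy in [0, 1]  (performance)
        inference_rate: float  # inferences per second              (performance)
        ram: float             # runtime memory, KB                 (cost)
        flash: float           # flash footprint, KB                (cost)
        energy: float          # energy per inference, mJ           (cost)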

Inputs to WISDOM

The inputs to WISDOM are the user's preferences, expressed as weights (priorities) assigned to each metric. These weights are based on the application's requirements; e.g., we might want higher accuracy and lower energy consumption but be willing to compromise on the inference rate. The weights are relative to each other. They are further constrained by filters (bounds) that restrict the minimum or maximum value of each metric; e.g., the microcontroller running the model might have limited RAM and flash memory.
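
For illustration, such preferences could be written down as follows (a hypothetical sketch; the names, units, and values are ours, not prescribed by the framework):

    # Hypothetical relative weights (priorities) per metric.
    weights = {
        "accuracy": 0.4,        # accuracy matters most
        "inference_rate": 0.1,  # inference rate is negotiable
        "ram": 0.2,
        "flash": 0.1,
        "energy": 0.2,          # energy matters (battery-powered device)
    }

    # Hypothetical filters: lower bounds on performance, upper bounds on cost.
    constraints = {
        "accuracy": 0.80,       # at least 80% accuracy
        "inference_rate": 1.0,  # at least 1 inference per second
        "ram": 320.0,           # KB; must fit the MCU's SRAM
        "flash": 2048.0,        # KB; must fit the MCU's flash
        "energy": 50.0,         # mJ per inference, at most
    }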

Outputs from WISDOM

WISDOM tries to provide the model configuration with the highest utility (performance minus cost). A model configuration describes a neural network using three parameters: the architecture, the number of parameters, and the compression technique used. In our experiments, different combinations of these parameters give us around 300 models to choose from.
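
A toy enumeration of such a configuration space might look as follows (the actual architectures, parameter budgets, and compression techniques in the paper differ, and their grid yields around 300 models):

    from itertools import product

    # Hypothetical configuration space, for illustration only.
    architectures = ["cnn", "lstm", "mlp"]
    param_budgets = [10_000, 50_000, 100_000, 500_000]
    compressions = ["none", "int8_quant", "pruning", "distillation"]

    configs = list(product(architectures, param_budgets, compressions))
    print(len(configs), "candidate model configurations")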

Utility function

Utility is defined as performance minus cost. This yields a quantity that initially increases as cost increases, but eventually starts dropping because the added cost no longer justifies the gain in performance. Performance is the weighted sum of accuracy and inferencing rate; cost is the weighted sum of runtime memory, flash memory, and energy per inference. Further, we put constraints on all the metrics: lower bounds on the performance metrics and upper bounds on the cost metrics. Overall, the utility function takes a model as input and produces a real number denoting that model's utility. We also normalize the metric values to enable fair comparisons.
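
A minimal sketch of this utility computation, assuming every metric (and every bound) has been min-max normalized to [0, 1] across the candidate models; the names are ours:

    PERF = ("accuracy", "inference_rate")   # performance metrics, lower-bounded
    COST = ("ram", "flash", "energy")       # cost metrics, upper-bounded

    def utility(metrics, weights, constraints):
        # Infeasible models get -inf so they are never selected.
        if any(metrics[m] < constraints[m] for m in PERF):
            return float("-inf")
        if any(metrics[m] > constraints[m] for m in COST):
            return float("-inf")
        performance = sum(weights[m] * metrics[m] for m in PERF)
        cost = sum(weights[m] * metrics[m] for m in COST)
        return performance - cost

    # The optimal model is then simply the argmax over all candidate models:
    # best = max(candidates, key=lambda m: utility(m, weights, constraints))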

Optimization problem

Given the weights and constraints, we get a unique utility function. Our goal is to ensure that the utility of the model chosen by the WISDOM framework, given the same weights and constraints, tends to the utility of the optimal model. The optimal model is the one with the highest utility for the given weights and constraints among the set of all possible models (around 300, as defined in the outputs). We solve this optimization problem empirically using a decision tree. The inputs to the decision tree are the weights and constraints; the output is a model whose utility is as close as possible to that of the optimal model. Our samples are therefore different permutations of weights and constraints, and the ground truth is the optimal model (the one with the highest utility). It is fairly simple to generate different combinations of weights and constraints using a script (we generated around 27K). But to find the optimal model we need to compute the utility of every model and choose the maximum. And to compute the utility of every model, we need to measure the metrics (accuracy, rate, RAM, flash, and energy) for each.
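
A minimal sketch of this setup using scikit-learn's decision tree; in the real pipeline each label comes from brute-force utility maximization over the ~300 candidates, whereas here the data are random placeholders:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((27_000, 10))            # 5 weights + 5 constraint bounds per sample
    y = rng.integers(0, 300, size=27_000)   # index of the optimal model (placeholder)

    tree = DecisionTreeClassifier(max_depth=12).fit(X, y)
    suggestion = tree.predict(rng.random((1, 10)))[0]   # model for a new user input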

Measuring the metrics

We first train and then compress the different deep learning models on a well-provisioned system. The compressed models are then deployed to a microcontroller (ESP32-C3-MINI in our case). During deployment, the TFLite library provides information about the flash and RAM required. Once deployed, the model performs sensing on the device and sends its inferences back to the system; from these inferences we measure the accuracy. Finally, while sensing is running on the microcontroller, we also measure the current drawn by the device using a Power Profiler Kit (PPK II) connected to it. From the current measurements we estimate the energy consumed per inference and the inferencing rate.
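
As an illustration, energy per inference can be estimated from the current trace roughly as follows (the file name, supply voltage, sample rate, and inference count are assumptions, not values from the paper):

    import numpy as np

    SUPPLY_V = 3.3        # assumed supply voltage (V)
    FS = 100_000          # assumed PPK II sampling rate (samples/s)

    current_a = np.loadtxt("ppk_current_samples.csv")  # one current sample (A) per row
    power_w = SUPPLY_V * current_a
    energy_j = np.trapz(power_w, dx=1.0 / FS)          # total energy over the capture

    n_inferences = 250    # inferences completed during the capture (from device logs)
    print(f"energy per inference: {1e3 * energy_j / n_inferences:.2f} mJ")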

Results

WISDOM chooses a model better than the best quantized model (among all the other models) 83% of the time, which shows that weight quantization alone is not always enough. It also chooses a better model than the best non-compressed model 99% of the time, which shows that choosing a model based solely on parameter count is not sufficient. More results can be found in the paper.