Authenticating ‘low-end wireless sensors’ with deep learning + SDR

5 min read · Aug 3, 2019

Machine learning with RF data. Software/firmware identities can be spoofed. An un-spoofable physical identity is needed.

Any device that emits radio waves (i.e. a radio) has a unique RF fingerprint. RF fingerprints arise from slight variations within hardware manufacturing tolerances: no two transmitters are exactly the same, and each one emits a slightly different signal, regardless of make or model.

The reason: RF transmitters are built from analog components such as digital-to-analog converters, band-pass filters, frequency mixers, and power amplifiers. A tiny variation in any of these hardware primitives yields a slightly altered radio signal. In technical terms, ‘manufacturing variations’ affect the following properties of an RF signal: I/Q imbalance, phase noise, carrier frequency and phase offsets, harmonic distortion, and power amplifier distortion. Together, these yield a unique RF fingerprint.
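To make that concrete, here is a minimal toy sketch of how a few of these impairments could alter an otherwise ideal baseband signal. The impairment model and all parameter values (`gain_imbalance`, `phase_skew`, `freq_offset`, `phase_offset`) are illustrative assumptions, not measurements of any real transmitter, and phase noise and amplifier distortion are left out.

```python
import cmath
import math

def ideal_tone(n_samples, cycles_per_sample=0.05):
    """Ideal complex baseband tone (I + jQ) with perfectly balanced branches."""
    return [cmath.exp(2j * math.pi * cycles_per_sample * n) for n in range(n_samples)]

def apply_impairments(samples, gain_imbalance=1.0, phase_skew=0.0,
                      freq_offset=0.0, phase_offset=0.0):
    """Apply a toy transmitter 'fingerprint' to ideal samples.

    gain_imbalance: Q-branch gain relative to I (1.0 means balanced)
    phase_skew:     quadrature error between the I and Q branches, in radians
    freq_offset:    carrier frequency offset, in cycles per sample
    phase_offset:   constant carrier phase offset, in radians
    """
    out = []
    for n, s in enumerate(samples):
        i = s.real
        # A skewed Q branch is scaled and picks up a little of the I branch.
        q = gain_imbalance * (s.imag * math.cos(phase_skew) + s.real * math.sin(phase_skew))
        # Carrier offsets rotate the whole constellation over time.
        rotation = cmath.exp(1j * (2 * math.pi * freq_offset * n + phase_offset))
        out.append(complex(i, q) * rotation)
    return out

def mean_abs_diff(x, y):
    """Average magnitude of the sample-by-sample difference between two signals."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

ideal = ideal_tone(1000)
# Two hypothetical transmitters of the "same model" with slightly different hardware.
tx_a = apply_impairments(ideal, gain_imbalance=1.02, phase_skew=0.01, freq_offset=1e-4)
tx_b = apply_impairments(ideal, gain_imbalance=0.97, phase_skew=-0.02, freq_offset=-2e-4)
print(mean_abs_diff(tx_a, tx_b))  # non-zero: the two signals are distinguishable
```

Both toy transmitters send the same ideal tone, yet their outputs differ sample by sample. That small, hardware-specific difference is what a fingerprinting system tries to pick up.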

A simplified block diagram of an RTL-SDR (i.e. a radio) and its components.

So, if every transmitter is bound to differ from every other, can we not use these un-spoofable distinguishing markers to uniquely identify a radio transmitter? In short, the answer is yes.

We can uniquely identify a radio transmitter by analyzing its radio transmissions/signals.

With that context, let's get to the whole point of this blog.

The Goal:

Explore the feasibility of an authentication system to uniquely identify a specific radio via its RF transmissions and provide an additional layer of security for low-end wireless devices.

Applications for short-range RF authentication (car key fobs, say) are numerous. Broadly speaking, we could use such a scheme as an additional check in just about any IoT use-case where an edge device or a regular router with limited compute power authenticates a bunch of wireless sensors.
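As a rough sketch of what such an "additional check" could look like on the gateway side, assume some classifier has already produced, for each chunk of a received transmission, a list of per-radio probabilities. The function name `authenticate`, the averaging rule and the 0.9 threshold are all hypothetical choices for illustration.

```python
def authenticate(claimed_id, window_probs, threshold=0.9):
    """Accept a transmission only if the classifier agrees with the claimed identity.

    claimed_id:   index of the radio the device claims to be
    window_probs: one list of per-radio probabilities for each chunk of the signal
    threshold:    minimum mean probability required (an arbitrary illustrative value)
    """
    mean_p = sum(p[claimed_id] for p in window_probs) / len(window_probs)
    return mean_p >= threshold, mean_p

# Made-up classifier outputs for three chunks of one transmission.
probs = [[0.97, 0.03], [0.95, 0.05], [0.99, 0.01]]
print(authenticate(0, probs))  # claim matches what the classifier sees
print(authenticate(1, probs))  # claim contradicts the RF evidence
```

The idea is that the software-level identity is only trusted when the physical-layer evidence backs it up.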

Conventional RF fingerprinting involves detecting a transient or steady-state signal and extracting the fingerprint from it, i.e. you build a database of RF fingerprints and use it to uniquely identify a transmitter. Manually extracting features this way takes time, requires detailed knowledge of the signals, and can get complicated depending on the RF characteristics you’re trying to fingerprint, especially for low-end edge devices.
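For a feel of what hand-crafted feature extraction involves, here is a small sketch of two such features computed from complex I/Q samples: a frequency estimate and an I/Q gain-imbalance estimate. These two estimators are my own illustrative picks, and real fingerprinting pipelines use many more features and more careful estimators.

```python
import cmath
import math

def estimate_freq_offset(samples):
    """Estimate frequency in cycles/sample from the average phase step
    between consecutive samples."""
    acc = sum(b * a.conjugate() for a, b in zip(samples, samples[1:]))
    return cmath.phase(acc) / (2 * math.pi)

def iq_gain_imbalance(samples):
    """Ratio of RMS amplitude on the Q branch to that on the I branch."""
    i_power = sum(s.real ** 2 for s in samples)
    q_power = sum(s.imag ** 2 for s in samples)
    return math.sqrt(q_power / i_power)

# Synthetic test signal: a tone at 0.01 cycles/sample (10 full cycles).
tone = [cmath.exp(2j * math.pi * 0.01 * n) for n in range(1000)]
# Same tone with the Q branch amplified by 5% to mimic a gain imbalance.
skewed = [complex(s.real, 1.05 * s.imag) for s in tone]

print(estimate_freq_offset(tone))
print(iq_gain_imbalance(skewed))
```

Each feature needs its own estimator, its own assumptions about the signal, and its own tuning, which is exactly the manual effort a learned feature extractor is meant to avoid.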

But if you think about it, fingerprinting is mostly about learning fine-grained patterns in hardware-specific imperfections. So, all you probably need is a good pattern detector, like a deep-learning neural network.

Machine learning has had remarkable success in image recognition thanks to breakthrough advances in deep-learning-based algorithms, chief among them Deep Convolutional Neural Networks. DCNNs (for short) pretty much form the backbone of many modern computer vision systems. But at their core, CNNs are just very good feature extraction engines (in other words, they’re very good at pattern recognition). So, I thought to myself: why not apply DCNNs to our problem?

To my surprise, I stumbled upon some research with promising results in this exact area — http://www.ece.neu.edu/fac-ece/ioannidis/static/pdf/2018/radio_identification.pdf

What follows is an attempt at implementing a PoC based on this research. Preliminary observations — it classifies 2 of my test emitters with 97% accuracy, especially at distances of 10 ft or shorter (with cheap hardware).

The set-up:

Deep learning libraries: TensorFlow as my backend and TFLearn as the high-level API
For raw radio data collection: an RTL-SDR with a pretty basic antenna, the pyrtlsdr and NumPy libraries, and 2 standard garage door remotes operating at 433 MHz
Programming environment: Visual Studio Code, GitHub
Miscellaneous stuff: a bit of SDR know-how, a little experimentation in a Jupyter notebook and a couple of hours of uninterrupted peace.
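With this set-up, grabbing raw I/Q samples could look roughly like the sketch below. It uses pyrtlsdr's `RtlSdr` class, and the center frequency, sample rate, gain setting and sample count are assumptions on my part, not values taken from the PoC repository. The `rtlsdr` import sits inside the function so the snippet loads even on a machine without the dongle or its drivers.

```python
def capture_duration_s(n_samples, sample_rate_hz):
    """How long a capture of n_samples lasts at the given sample rate."""
    return n_samples / sample_rate_hz

def capture_iq(n_samples=256 * 1024, center_freq_hz=433e6,
               sample_rate_hz=1e6, gain="auto"):
    """Read complex I/Q samples from the first RTL-SDR dongle found.

    All defaults are illustrative assumptions; tune them for your remotes.
    """
    from rtlsdr import RtlSdr  # lazy import: requires pyrtlsdr plus the hardware

    sdr = RtlSdr()
    try:
        sdr.sample_rate = sample_rate_hz
        sdr.center_freq = center_freq_hz
        sdr.gain = gain
        return sdr.read_samples(n_samples)
    finally:
        sdr.close()  # always release the device, even if the read fails

# 256 * 1024 samples at 1 MS/s is roughly a quarter of a second of signal.
print(capture_duration_s(256 * 1024, 1e6))

# With a dongle plugged in, hold the remote's button down and run:
# samples = capture_iq()
```

Since a key-fob press is short, the capture length matters: too few samples and you may miss the burst entirely.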

Steps:

Collect raw radio data (I/Q samples) over multiple transmissions via the RTL-SDR hardware and the pyrtlsdr library
Label, prepare and store your data with NumPy in a format your neural network can consume
Define your neural network with the TFLearn API on top of TensorFlow
Train the neural network with labeled data for about 50–100 epochs
Use the trained model to make predictions
Evaluate for accuracy.
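The label-and-prepare step can be sketched as follows: slice each radio's capture into fixed-length windows with separate I and Q rows, and attach a one-hot label. The window length of 128 and the helper names are assumptions for illustration. Plain Python lists are used to keep the sketch dependency-free, though in practice NumPy arrays would be the natural choice.

```python
def to_windows(samples, window_len=128):
    """Split complex samples into non-overlapping windows.

    Each window is [[I values], [Q values]], i.e. a 2 x window_len grid.
    A trailing partial window is dropped.
    """
    windows = []
    for start in range(0, len(samples) - window_len + 1, window_len):
        chunk = samples[start:start + window_len]
        windows.append([[s.real for s in chunk], [s.imag for s in chunk]])
    return windows

def one_hot(label, n_classes):
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]

def build_dataset(captures, window_len=128):
    """captures maps an integer radio label (0, 1, ...) to its complex samples."""
    n_classes = len(captures)
    X, Y = [], []
    for label, samples in sorted(captures.items()):
        for window in to_windows(samples, window_len):
            X.append(window)
            Y.append(one_hot(label, n_classes))
    return X, Y

# Dummy stand-ins for two captures (not real radio data).
captures = {
    0: [complex(n, -n) for n in range(300)],
    1: [complex(0, 1)] * 256,
}
X, Y = build_dataset(captures)
print(len(X), len(X[0]), len(X[0][0]))  # number of windows, rows, samples per row
```

Shuffling the windows and holding some back for validation would come next, before any of this reaches the network.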

A typical radio identification workflow with deep learning. Both training and the trained model can run on, say, an edge or gateway device.

The code for the PoC is available at https://github.com/nihalpasham/fingerprinting_radios_w_ML and includes scripts to:

Capture, prepare, label and format IQ data-sets
Define and train a DCNN

Chosen DCNN (from the paper) includes 2 convolutional layers and 2 fully connected layers to classify 2 distinct radios
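A network of that shape could be written in TFLearn roughly as below. This is a sketch under assumptions, not the code from the repository: the filter counts and sizes, layer widths, dropout rate, learning rate and input shape are illustrative guesses, and it assumes a TensorFlow 1.x environment with TFLearn installed. The import is placed inside the function so the snippet can be loaded without those libraries.

```python
def build_dcnn(window_len=128, n_classes=2):
    """Two convolutional layers and two fully connected layers, followed by
    a softmax output with one unit per radio.

    Every hyperparameter below is an illustrative assumption.
    """
    import tflearn  # lazy import: needs TensorFlow 1.x and TFLearn

    # Input: one window of I/Q samples as a 2 x window_len single-channel "image".
    net = tflearn.input_data(shape=[None, 2, window_len, 1])
    net = tflearn.conv_2d(net, 50, [1, 7], activation="relu")
    net = tflearn.conv_2d(net, 50, [2, 7], activation="relu")
    net = tflearn.fully_connected(net, 256, activation="relu")
    net = tflearn.dropout(net, 0.5)  # keep probability of 0.5
    net = tflearn.fully_connected(net, 80, activation="relu")
    net = tflearn.dropout(net, 0.5)
    net = tflearn.fully_connected(net, n_classes, activation="softmax")
    net = tflearn.regression(net, optimizer="adam",
                             loss="categorical_crossentropy",
                             learning_rate=1e-4)
    return tflearn.DNN(net)

# Training would then look something like this, with X shaped (N, 2, 128, 1)
# and Y holding one-hot labels:
# model = build_dcnn()
# model.fit(X, Y, n_epoch=50, validation_set=0.1, show_metric=True, batch_size=64)
# probabilities = model.predict(X[:10])
```

The first convolution slides along the I and Q rows separately, while the second spans both rows, so it can pick up relationships between the two branches.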

Requirements:

Any piece of low-cost SDR hardware (sub-1 GHz will do for most IoT stuff).
Any IoT edge device capable of running a deep learning model.
Robust RF data samples (i.e. samples should cover everything from low to high SNR, temperature variations, injected noise, etc.). A model is only as good as its data.
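One cheap way to widen the SNR range of a data-set is to inject synthetic noise into clean captures. The sketch below adds white Gaussian noise at a chosen SNR. It is an assumed augmentation approach for illustration, not something the PoC scripts are known to do, and the helper names are hypothetical.

```python
import cmath
import math
import random

def add_awgn(samples, snr_db, rng=None):
    """Return a copy of samples with complex white Gaussian noise added
    so that the resulting signal-to-noise ratio is about snr_db."""
    rng = rng or random.Random()
    signal_power = sum(abs(s) ** 2 for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # noise power is split evenly across I and Q
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in samples]

def measured_snr_db(clean, noisy):
    """SNR in dB computed from a clean signal and its noisy version."""
    signal = sum(abs(c) ** 2 for c in clean)
    noise = sum(abs(n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(signal / noise)

# Synthetic clean tone standing in for a high-SNR capture.
clean = [cmath.exp(2j * math.pi * 0.01 * n) for n in range(20000)]
noisy = add_awgn(clean, snr_db=10, rng=random.Random(0))
print(measured_snr_db(clean, noisy))  # close to the requested 10 dB
```

Synthetic noise is no substitute for real captures at different distances and temperatures, but it is a quick way to check how gracefully a model degrades.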

Benefits: no need for

Higher-level authentication protocols
Schemes involving encryption, challenge-response pairs, etc.
Managing a database of stored credentials
Dealing with masquerading or impersonation attacks

Challenges:

Accuracy drops progressively as distance (range) increases.
The performance (speed) of the trained model still needs evaluating.
The computational overhead on targets such as low-end edge devices. It’s just an early PoC for now and hasn’t been put through a full suite of tests, e.g. low-SNR scenarios.
My RTL-SDR is a cheap $25 dongle that doesn’t have the bandwidth or the frequency range to capture high-frequency RF signals like Bluetooth, WiFi, etc., and it struggles to sample beyond 1 million samples/s with the pyrtlsdr library.
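To put numbers on the first challenge, predictions could be grouped by the distance at which each test transmission was recorded. A minimal sketch follows. The record format is an assumption, and the toy records are placeholders that only exercise the function. They are not measurements from the PoC.

```python
from collections import defaultdict

def accuracy_by_distance(records):
    """records: iterable of (distance_ft, true_label, predicted_label).

    Returns a dict mapping each distance to the fraction of correct predictions.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for distance_ft, true_label, predicted_label in records:
        totals[distance_ft] += 1
        hits[distance_ft] += int(true_label == predicted_label)
    return {d: hits[d] / totals[d] for d in sorted(totals)}

# Placeholder records, NOT real results.
toy_records = [
    (5, 0, 0), (5, 1, 1),
    (10, 0, 0), (10, 1, 0),
    (20, 0, 1), (20, 1, 0),
]
print(accuracy_by_distance(toy_records))
```

Logging the distance (and ideally the SNR) alongside every test capture makes this kind of breakdown trivial later on.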

Credits:

Deep Learning Convolutional Neural Networks for Radio Identification — http://www.ece.neu.edu/fac-ece/ioannidis/static/pdf/2018/radio_identification.pdf

RF Machine Learning Systems (RFMLS) — https://www.darpa.mil/attachments/RFMLSIndustryDaypublicreleaseapproved.pdf

Written by Nihal Pasham