Audio Modem Communication Library in Python

roman_zeyde · on June 17, 2018

Author here, thanks for posting this :)

Would be happy to answer any question.

zaarn · on June 18, 2018

How reliably does this work over FM transmission or VoIP? (If it doesn't I would guess and say that you'd have to add some mechanism to detect reliable frequency bands like in existing modems)

roman_zeyde · on June 18, 2018

Unfortunately, VoIP doesn't allow reliable OFDM communication... It's preferable to transmit and receive the audio as-is (without any compression).

ggerganov · on June 17, 2018

In your experience, how reliable is the communication between 2 devices when using speakers and microphone (no cable) at distance ~3 meters? What bandwidth do you typically achieve in such conditions?

roman_zeyde · on June 17, 2018

~3 meters is indeed quite a challenging scenario. I've tested it with the lowest bit-rate supported (1 kbps), but it doesn't work reliably.

You can try my Android app [1] to reproduce the experiments for short transmissions.

[1] https://play.google.com/store/apps/details?id=bit.zeyde.audi...

ChuckMcM · on June 17, 2018

Nicely done, I'll note for the readership here that a similar technique works fine at 300 baud across the room using FSK.

Something you might consider is to write a web camera backend for it and use the screen/webcam. That has a lot more bandwidth and has higher SNR as the selector effect of the web camera is so much better than that of a microphone.

brian-armstrong · on June 17, 2018

FSK across the room? Does that have noise restrictions?

I’ve found all the phase modulation schemes work poorly across the room. DSSS works ok but is very slow. I would be curious how reliable FSK is in that situation.

ChuckMcM · on June 17, 2018

With 1270Hz/1070Hz you've got over 3 complete cycles to identify the tone at 300 baud; with higher frequency pairs it gets better. The person who wrote it was Tom Lyon at Sun, but I bet I could reproduce it with a gnuradio-companion flow graph. That would make it pretty easy for anyone to play with it.
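The cycles-per-symbol arithmetic above can be checked with a minimal sketch. The 48 kHz sample rate, the mapping of bit 1 to 1270 Hz, and the correlation-based detector are all assumptions made for illustration; this is not the implementation ChuckMcM refers to, and it ignores noise, echo, and symbol timing recovery.

```python
import math

BAUD = 300                         # symbols per second, from the comment above
MARK_HZ, SPACE_HZ = 1270.0, 1070.0 # tone pair from the comment above
SAMPLE_RATE = 48000                # assumption: a common sound-card rate
SAMPLES_PER_SYMBOL = SAMPLE_RATE // BAUD  # 160 samples per bit


def cycles_per_symbol(freq_hz, baud=BAUD):
    """Number of complete tone cycles that fit in one symbol period."""
    return freq_hz / baud


def tone_energy(samples, freq_hz):
    """Energy of `samples` at `freq_hz` (a single-bin DFT correlation)."""
    re = im = 0.0
    for n, s in enumerate(samples):
        angle = 2 * math.pi * freq_hz * n / SAMPLE_RATE
        re += s * math.cos(angle)
        im += s * math.sin(angle)
    return re * re + im * im


def modulate(bits):
    """Emit one burst of MARK_HZ (bit 1) or SPACE_HZ (bit 0) per bit."""
    out = []
    for bit in bits:
        freq = MARK_HZ if bit else SPACE_HZ
        out.extend(math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                   for n in range(SAMPLES_PER_SYMBOL))
    return out


def demodulate(samples):
    """Pick, for each symbol window, the tone with more energy."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_SYMBOL + 1,
                       SAMPLES_PER_SYMBOL):
        chunk = samples[start:start + SAMPLES_PER_SYMBOL]
        is_mark = tone_energy(chunk, MARK_HZ) > tone_energy(chunk, SPACE_HZ)
        bits.append(1 if is_mark else 0)
    return bits
```

At 300 baud the 1070 Hz tone completes roughly 3.6 cycles per symbol and the 1270 Hz tone roughly 4.2, which is what gives the detector enough signal to tell the two apart.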

brian-armstrong · on June 17, 2018

Interesting, I will have to give it a try then. Thanks!

darkmighty · on June 17, 2018

I don't have first-hand experience, but I believe a big problem with sound channels is multipath propagation and a challenging frequency response. FSK naturally deals quite well with those; it's similar to OFDM but more robust due to single-carrier symbols. You can extend the symbol period as long as you like while using an adequate number. Finally, in challenging environments it'd be a good idea to use error correction, not just error detection as done here: you can (in theory) achieve reliable transmission over arbitrarily noisy channels by adding redundancy.
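To make the difference between detection and correction concrete, here is a minimal sketch of the crudest possible error-correcting code: repeat every bit three times and take a majority vote. The function names are hypothetical and this is not something the library does; practical modems use far more efficient codes, but the principle of trading bandwidth for redundancy is the same.

```python
def encode_repetition(bits, n=3):
    """Repeat each bit n times (n should be odd for a clean majority)."""
    return [bit for bit in bits for _ in range(n)]


def decode_repetition(coded, n=3):
    """Majority-vote each group of n received bits back into one bit."""
    decoded = []
    for start in range(0, len(coded), n):
        group = coded[start:start + n]
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded


def flip(bits, positions):
    """Return a copy of `bits` with the given positions inverted."""
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out
```

With `n=3` the receiver recovers the original data as long as at most one bit in each group of three is flipped, at the cost of sending three times as many bits.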

ggerganov · on June 18, 2018

I have experimented with FSK modulation myself and was able to achieve quite reliable transmission across the room [1]. Although the bandwidth is quite low (8-16 B/s), it works most of the time.

[1] https://ggerganov.github.io/jekyll/update/2018/05/31/data-ov...

brian-armstrong · on June 18, 2018

Neat library!

Your post mentions not finding any reliable modem libraries. I'm curious what issues you had with mine, if you tried it?

https://quiet.github.io/quiet-js

ggerganov · on June 18, 2018

Hi, I did try it and it was working quite well when the devices (phone and laptop) were within ~30 cm. After increasing the distance and turning on the TV in the background, it wasn't working that well any longer. I tried only the default protocol with audible frequencies (the ultrasound option didn't work, but my phone does not support ultrasound with my application either; I guess it's a hardware limitation).

I am willing to sacrifice bandwidth to achieve better reliability of the communication and I believe I was able to achieve it. Although, I haven't done tests with many other devices/conditions than my own, so my result might be biased.

brian-armstrong · on June 18, 2018

Interesting. What modulation does yours use?

noir_lord · on June 17, 2018

Hmm, if you were using a screen/webcam, could you do something horribly crude like flash QR codes and leverage their EC?

ChuckMcM · on June 17, 2018

Funny you should mention that, I wrote exactly that sort of link using my QR clock code (tweet embedded video of same: https://twitter.com/ChuckMcManis/status/794025248203022336)

Dowwie · on June 17, 2018

Very cool concept! Radio stations used to share software over the airwaves by using something like this.

How long would it take to transmit a 3MB jpeg and then encode it?

roman_zeyde · on June 17, 2018

Thanks!

Using an audio cable (for better SNR), I have managed to achieve and sustain 60kbps for long periods of time, so a 3MB file should take ~7 minutes to transmit.
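The ~7 minute estimate follows directly from the file size and the bit rate. A quick check, assuming 1 MB = 10^6 bytes and ignoring any framing or checksum overhead:

```python
def transfer_seconds(size_bytes, bitrate_bps):
    """Ideal transfer time: total bits divided by bits per second."""
    return size_bytes * 8 / bitrate_bps


seconds = transfer_seconds(3 * 10**6, 60_000)  # 3 MB at 60 kbps
minutes = seconds / 60                          # a little under 7 minutes
```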

jedberg · on June 17, 2018

I’m wondering, did you build this just to see if you could, or did you have a use case in mind?

roman_zeyde · on June 17, 2018

Both :)

The initial use-case was performing air-gapped communication for signing Bitcoin transactions [1] on an offline machine.

[1] https://github.com/spesmilo/electrum/pull/964

brian-armstrong · on June 17, 2018

Do you have ideas or plans on how to increase the transmit speed?

roman_zeyde · on June 18, 2018

The original goal was transmitting ~10kB in a few seconds over an audio cable, so I didn't try to improve the transmission speed much further. I guess that adding error correction coding will allow increasing the highest possible bit-rate (currently I'm using only CRC-32 for error detection).
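For readers unfamiliar with CRC-based error detection, the idea can be sketched with Python's standard `zlib.crc32`. The frame layout below (payload followed by a 4-byte big-endian checksum) and the function names are assumptions for illustration, not the library's actual framing.

```python
import struct
import zlib


def add_checksum(payload: bytes) -> bytes:
    """Append the CRC-32 of the payload as 4 big-endian bytes."""
    return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)


def check_frame(frame: bytes):
    """Return the payload if the checksum matches, otherwise None."""
    if len(frame) < 4:
        return None
    payload, trailer = frame[:-4], frame[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        return None  # corruption detected; the receiver can only discard
    return payload
```

A failed check tells the receiver that the frame is damaged but not which bits are wrong, which is why detection alone forces a conservative bit rate or a retransmission.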

codezero · on June 17, 2018

Was this inspired by any of the dialup modem protocols?

roman_zeyde · on June 17, 2018

Indeed, I am using OFDM [1], which uses several carriers, each modulated via QAM [2].

[1] https://en.wikipedia.org/wiki/Orthogonal_frequency-division_...

[2] https://en.wikipedia.org/wiki/Quadrature_amplitude_modulatio...
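As a rough illustration of "several carriers, each modulated via QAM", the sketch below puts one 4-QAM (QPSK) point on each of four carriers and sums them into a single real-valued audio symbol. The symbol length, carrier indices, and constellation mapping are assumptions chosen for readability, not the library's actual parameters, and there is no cyclic prefix, equalization, or synchronization.

```python
import cmath
import math

N = 80                       # samples per OFDM symbol (assumption)
CARRIERS = [10, 11, 12, 13]  # whole cycles per symbol, so carriers are orthogonal
QPSK = {                     # 2 bits per carrier (assumed mapping)
    (0, 0): 1 + 1j,
    (0, 1): -1 + 1j,
    (1, 1): -1 - 1j,
    (1, 0): 1 - 1j,
}


def modulate_symbol(bits):
    """Map 2 bits to each carrier and sum the carriers into real samples."""
    points = [QPSK[tuple(bits[i:i + 2])] for i in range(0, len(bits), 2)]
    return [
        sum((point * cmath.exp(2j * math.pi * k * n / N)).real
            for point, k in zip(points, CARRIERS))
        for n in range(N)
    ]


def demodulate_symbol(samples):
    """Correlate with each carrier, then pick the nearest QPSK point."""
    bits = []
    for k in CARRIERS:
        corr = sum(s * cmath.exp(-2j * math.pi * k * n / N)
                   for n, s in enumerate(samples)) * 2 / N
        nearest = min(QPSK, key=lambda b: abs(QPSK[b] - corr))
        bits.extend(nearest)
    return bits
```

Because every carrier completes a whole number of cycles within the symbol, the correlation for one carrier sees zero contribution from the others, which is the "orthogonal" part of OFDM.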

viraptor · on June 18, 2018

I'm a bit confused. How does QAM work with audio if you have only one channel rather than I/Q? Are you using multiple frequencies for that?

roman_zeyde · on June 18, 2018

QAM uses both the carrier's amplitude and phase to encode information, so I/Q can be thought of as [1]:

  I = amplitude*cos(phase)
  Q = amplitude*sin(phase)

[1] http://whiteboard.ping.se/SDR/IQ
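The reason one real audio channel can carry both I and Q is that the cosine and sine of the same carrier are orthogonal over a whole number of cycles. A minimal sketch, with the carrier and window sizes chosen arbitrarily as assumptions:

```python
import math


def iq_from_polar(amplitude, phase):
    """The two formulas above: amplitude/phase to I/Q."""
    return amplitude * math.cos(phase), amplitude * math.sin(phase)


def passband(i, q, cycles=10, n=200):
    """One real signal: I rides on the cosine, Q on the (negated) sine.

    Equivalent to amplitude * cos(carrier_angle + phase).
    """
    return [
        i * math.cos(2 * math.pi * cycles * k / n)
        - q * math.sin(2 * math.pi * cycles * k / n)
        for k in range(n)
    ]


def recover_iq(samples, cycles=10):
    """Correlate with cosine and sine over whole cycles to separate I and Q."""
    n = len(samples)
    i = 2 / n * sum(s * math.cos(2 * math.pi * cycles * k / n)
                    for k, s in enumerate(samples))
    q = -2 / n * sum(s * math.sin(2 * math.pi * cycles * k / n)
                     for k, s in enumerate(samples))
    return i, q
```

Feeding `passband(*iq_from_polar(2.0, math.pi / 3))` into `recover_iq` returns the original I and Q up to floating-point error, even though only a single stream of real samples was transmitted.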

viraptor · on June 18, 2018

I'm just sitting here and staring at this sentence: "The modulated signal rides on a carrier of a given frequency, but the base band signal got no fixed frequency at all. Because of this, we have the possibility to encode the two-dimensional I/Q signal onto the one-dimensional RF signal without losing anything." I may need to sleep on it...

someguydave · on June 17, 2018

Have you tested this software on macos?

roman_zeyde · on June 17, 2018

No, I've used it only on my Linux machine.

I guess it should also work on MacOS, since it depends on `numpy` [1] and `portaudio19-dev` [2] Debian packages (which should be installable via Pip/Homebrew).

[1] http://www.numpy.org/

[2] https://people.csail.mit.edu/hubert/pyaudio/

jacob019 · on June 17, 2018

Can it work above the audible spectrum?

roman_zeyde · on June 17, 2018

Currently, it does not - mainly because I've tried to use all available bandwidth (to increase the bit-rate).

Please take a look at https://github.com/quiet/quiet-js for an ultrasonic data transmission.

jaakl · on June 17, 2018

Btw, is there already a handshake and data protocol for AI voice calls to a business phone that happen to be answered by another AI? Instead of a voice call, they could negotiate using data packets and save some time and gain clarity.

viraptor · on June 18, 2018

This doesn't seem very useful. If you already have some agreement between companies over how AIs communicate, you can create a registry for them and skip the handshake. Use http or something.

roman_zeyde · on June 17, 2018

Nice idea :)

However, I'm not aware of any such protocol...

yodakohl · on June 17, 2018

Awesome project. I used a similar program (minimodem[1]) a while ago to configure Wi-Fi settings via audio [2]. It's great being able to do some basic setting without having a radio connection. Amazon used the same technique in their Dash-Button albeit in the inaudible range [3].

[1] https://github.com/kamalmostafa/minimodem [2] https://www.youtube.com/watch?v=DfSHclXjobY [3] http://www.blog.jay-greco.com/wp/?p=116

roman_zeyde · on June 17, 2018

Thanks :)

brian-armstrong · on June 17, 2018

Awesome library! I feel like sound modems don’t get enough attention and carry some “old-tech” stigma from the dialup days, even though they’re fairly different.

Have you tried the Google Nearby library? Do you think that sound modems still have a place now that Bluetooth LE is more of an option?

edit: Also, do you have a browser version I can try?

roman_zeyde · on June 17, 2018

Thanks a lot, I definitely agree :)

I didn't try Google Nearby, since I was developing the modem to run on a desktop machine (but later I've created a simple Android app [1], supporting the lowest bit-rate).

Unfortunately, I don't have a working web-based version of my modem...

[1] https://play.google.com/store/apps/details?id=bit.zeyde.audi...

itodd · on June 17, 2018

This is cool and answered a recent itch. I have a pocket operator (po-32) which you can update to play new sounds by using a data over audio protocol. It's really neat. Thanks for sharing.

roman_zeyde · on June 18, 2018

Thanks, great to hear :)