DJ Party: A Collaborative Music Teacher using Atmega1284

Introduction

“This device allows friends to learn songs, create their own songs, and collaboratively merge them.”

Our final project for ECE 4760 is a fully customizable button keyboard that has a variety of features to allow for collaborative music-making between friends. Each keyboard has eight fully programmable buttons; any button can play any valid MIDI note and there are 16 customizable instrument sounds, all created using FM synthesis. Our device allows for real-time MIDI recording and playback, so that you can experiment with different harmonies and melodies. After recording a melody, you can also play it back in the exact pattern you played it using our MIDI synthesizer. The biggest collaborative feature of our device is the ability to send recorded MIDI tracks to a friend using the same device through IR transmission, such that you can hear the melody they created and superimpose your melody on top of theirs. The device is able to record up to four simultaneous channels, each of which can have its own instrument and play up to three simultaneous notes; this way, friends can send melodies back-and-forth to each other to create tone-rich songs. Lastly, we have a teacher mode, where the user plays along with preprogrammed songs. After playing it, they will receive feedback on how well the song was recreated. By using this mode, the user will be able to play the melody perfectly after some practice.

Project Overview

Motivation

The goal of our project is to allow users to enjoy music as much as we do. Both of us are musicians, and we often find it difficult to find time in our busy schedules to meet and make music. Our project allows each musician to create his or her own part of a song, send it to a friend, and have the friend add another part at a later time. To further foster musical creativity, we added a teacher mode that allows users to hear a song and attempt to play it back by ear. The prototype gives the user feedback on how well he or she played the song, so the user can improve with practice.
Our prototype uses a variety of concepts learned in ECE 4760 and other classes. Since our project is targeted at DJing, we decided to use FM synthesis to create modern electronic sounds. The prototype uses pulse-width modulation and direct digital synthesis to play back music. To persist songs, we implemented our own MIDI library, which produces valid MIDI streams and is able to play back a subset of MIDI. We wrote additional algorithms to merge MIDI tracks (for collaboration) and to compare the similarity of MIDI tracks (for teaching). This library was extensively unit tested to ensure accuracy. Finally, for wirelessly transmitting tracks between prototypes, we used IR LED transmission adapted from previous work for ECE 4760. We implemented a transfer protocol based on TCP/IP to ensure reliable data transfer.

Background Math

Direct Digital Synthesis

Direct digital synthesis (DDS) uses an accumulator, an increment, and a wave table to create a digital signal. At a high level, we create a wave table that holds the values of a 2π-periodic wave from 0 to 2π, scaled to an amplitude of our choosing. Assume for simplicity that the wave table holds a sine wave. As we step through the table indices, the outputs trace out the sine wave. If we make larger jumps through the table, we output fewer samples before reaching the end of one period. At a constant sampling frequency, a period made of fewer samples is reconstructed as a sine wave with a higher frequency. Thus, we can alter the frequency of the output sine wave by changing the size of the jumps through the table. To move through the table in software, we use an increment, which dictates how large a jump we make at each time tick, and an accumulator, which keeps track of where we are in the table. Time ticks occur at a sampling frequency of our choice.
To make the accumulator “periodic,” we leverage integer overflow by making the accumulator a 16-bit unsigned integer. An accumulator value of 0 means that we are at sin(0), and a value of 2¹⁶ – 1 (its maximum before overflow) means that we are as close to sin(2π) as possible. Once the accumulator overflows to zero, we are back at sin(0). Leveraging this fact, we can set the increment to a specific value to obtain a desired output frequency. Figure 2 below shows the equation we used to choose our increment value,

f_out = (F_s · I) / 2^N

where f_out is the resulting frequency of the output sinusoid, F_s is the sampling frequency, which is 8 kHz in our case, I is the value of the increment, and N is the number of bits in the accumulator (16 in our case). The number of samples in one output period depends on the size of the increment. For instance, if the increment is 1, it takes 2¹⁶ samples before the accumulator overflows and starts the next period. To ensure at least S samples per output period, the increment must be at most 2¹⁶ / S = 2^{16 – log₂S}, which means that the output frequency is limited to F_s / S.
Note that triangle and sawtooth waves can be constructed the same way. If the 2π-periodic sine wave is replaced with any other 2π-periodic wave, direct digital synthesis will output that waveform instead.
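As a rough illustration of the accumulator-and-increment scheme, the following C sketch computes the increment from a target frequency and steps through a wave table. The 8 kHz sample rate and 16-bit accumulator come from the text; the 256-entry table size and all identifiers (dds_increment, dds_tick, dds_t) are assumptions for illustration, not the project's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_RATE_HZ 8000u  /* F_s from the text */
#define ACC_BITS       16u    /* N from the text */
#define TABLE_SIZE     256u   /* assumed wave-table length */

/* I = round(f_out * 2^N / F_s), the increment equation solved for I. */
static uint16_t dds_increment(uint32_t f_out_hz)
{
    assert(f_out_hz < SAMPLE_RATE_HZ / 2u);  /* stay below Nyquist */
    uint32_t scaled = (f_out_hz << ACC_BITS) + SAMPLE_RATE_HZ / 2u;
    return (uint16_t)(scaled / SAMPLE_RATE_HZ);
}

typedef struct {
    uint16_t accumulator;  /* current position within one period */
    uint16_t increment;    /* jump per sample tick */
} dds_t;

/* One sample tick: look up the table with the top 8 bits of the
 * accumulator, then advance. The 16-bit addition wraps at 2^16,
 * which is what closes the period. */
static int8_t dds_tick(dds_t *osc, const int8_t table[TABLE_SIZE])
{
    int8_t sample = table[osc->accumulator >> 8];
    osc->accumulator = (uint16_t)(osc->accumulator + osc->increment);
    return sample;
}
```

For example, a 1 kHz tone at 8 kHz sampling gives an increment of 8192, so the accumulator wraps after exactly eight ticks. On the MCU, dds_tick would presumably run inside the sample-rate timer interrupt, and any 2π-periodic table (sine, triangle, sawtooth) can be passed in.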

FM Synthesis

We chose to use FM synthesis to create different sounds. Since our project is aimed at DJs, who are accustomed to using button boards to create music, we were able to leverage the many electronic-sounding instruments that FM synthesis can produce. The FM synthesis equation we use is:

Figure 3: FM Synthesis Formula

where functions f and g may be 2π-periodic sine, sawtooth, or triangle waves. Note that the waveform has two components: a carrier frequency, f_c, and a modulation component, f_m. The modulation component provides an interesting, electronic-like sound. Users can select different “instruments” that change the functions represented by f and g and modify the values of D and τ_d.
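To make the roles of the carrier, the modulator, and the decaying depth concrete, here is a minimal fixed-point sketch. It assumes a common phase-modulation form, out(t) = f(2π·f_c·t + D·e^(−t/τ_d)·g(2π·f_m·t)), with f and g both taken to be triangle waves; that form, the shift-based decay standing in for τ_d, and every identifier below are assumptions for illustration rather than the project's exact formula or code.

```c
#include <assert.h>
#include <stdint.h>

/* Triangle wave over one 16-bit phase period; returns -128..126. */
static int16_t tri_wave(uint16_t phase)
{
    uint8_t p = (uint8_t)(phase >> 8);  /* top 8 bits: 0..255 */
    return (p < 128) ? (int16_t)(2 * p - 128) : (int16_t)(382 - 2 * p);
}

typedef struct {
    uint16_t carrier_phase, carrier_inc;  /* DDS state for f_c */
    uint16_t mod_phase, mod_inc;          /* DDS state for f_m */
    uint16_t depth;        /* current modulation depth, in phase units */
    uint8_t  decay_shift;  /* larger = slower decay (stands in for tau_d) */
} fm_voice_t;

/* Produce one sample: the modulator output, scaled by the current
 * depth, is added to the carrier phase before the carrier lookup. */
static int16_t fm_next_sample(fm_voice_t *v)
{
    assert(v->decay_shift < 16);
    int32_t offset = ((int32_t)v->depth * tri_wave(v->mod_phase)) / 128;
    int16_t out = tri_wave((uint16_t)(v->carrier_phase + (uint16_t)offset));

    v->carrier_phase = (uint16_t)(v->carrier_phase + v->carrier_inc);
    v->mod_phase     = (uint16_t)(v->mod_phase + v->mod_inc);
    /* Multiply depth by (1 - 2^-shift) each sample: an approximately
     * exponential decay with a time constant of about 2^shift samples. */
    v->depth = (uint16_t)(v->depth - (v->depth >> v->decay_shift));
    return out;
}
```

With depth set to zero the voice reduces to a plain DDS triangle wave; a nonzero depth bends the carrier phase and adds sidebands that fade as the depth decays. Swapping tri_wave for a sine or sawtooth lookup would correspond to choosing a different "instrument" in this sketch.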

High-Level Block Diagram

1. Command Line Interface

The user issues commands to the prototype via a PuTTY terminal on a Windows computer. The computer is connected to the MCU via a USB cable into the MCU’s USART0. The characters from the command are sent serially via standard USB communication protocol. The user is given a list of commands to issue to the prototype, such as “record” or “learn” and enters them into the PuTTY terminal. The user receives his or her feedback from teacher mode in the terminal.

2. Speakers

We use direct digital synthesis to create sound, so the output of the MCU’s pulse-width modulator (port B.3) is low-passed and connected to standard computer speakers via alligator clips.

3. Push-button Keyboard

The user plays music via a push-button keyboard, which is an array of active-low buttons: each button is a short circuit when pressed and an open circuit when released. We enabled the pull-up resistors on port C of the MCU and wired each button to a port C pin.

4. IR LED Transceiver

We use USART1 on the MCU to serially transmit and receive MIDI streams from other prototypes. We transmit the streams via packets, and each packet is transmitted using a protocol adapted from ECE 4760 Lab 3. Since a typical MIDI stream consists of many packets, we use an algorithm based on TCP/IP to resend dropped packets.

Design Decisions

Microcontroller

The primary design constraint that we encountered was internal SRAM size. Since we chose to use SMPTE-time MIDI, each stream requires 2 bytes for every 10 milliseconds of track time. Once we include header overhead and the other MIDI events in the stream, even 16 kB of RAM proves small. Thus, we decided to use the AtMega 1284p, since it has the largest RAM capacity of the AtMega series.
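The arithmetic behind this constraint can be sketched as follows. The 2 bytes per 10 ms figure is from the text; the function names are hypothetical, and the calculation deliberately ignores headers and note events, so real tracks fit less than this upper bound.

```c
#include <stdint.h>

#define TICK_PERIOD_MS 10u  /* one time tick every 10 ms (from the text) */
#define BYTES_PER_TICK 2u   /* each tick costs 2 bytes (from the text) */

/* Bytes consumed by time ticks alone for a track of the given length. */
static uint32_t timing_bytes(uint32_t track_ms)
{
    return (track_ms / TICK_PERIOD_MS) * BYTES_PER_TICK;
}

/* Longest track, in milliseconds, whose time ticks fit in the budget. */
static uint32_t max_track_ms(uint32_t budget_bytes)
{
    return (budget_bytes / BYTES_PER_TICK) * TICK_PERIOD_MS;
}
```

One minute of track time costs 12,000 bytes in ticks alone, and a full 16,384 bytes would hold just under 82 seconds even if nothing else lived in RAM.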

Transceiver

We weighed three factors when choosing IR LEDs as the medium for transmitting and receiving MIDI data: cost, required research, and reliability. First, the low cost of 2 IR LEDs and 1 IR receiver per prototype made an IR LED transceiver especially appealing; since we needed to build two prototypes for the demo, low cost was paramount. Second, since we had already used IR LED transmission in a previous lab, we were very familiar with how it worked and what we would need to do to get it working on our MCU. Finally, testing performed by Bruce Land revealed that within a 3-meter range, the packet loss rate for the Lab 3 IR transmission configuration was less than 0.1%.

Sound Encoding

We chose MIDI to encode songs because it is highly compressed and universally used. Choosing an industry standard gave us confidence that we would not have the unnoticed edge cases a custom encoding might hide, and our tests for the library could be decoupled from the integrity of the encoding. Using an industry standard also ensures portability if we wish to interface future versions of our prototype with other MIDI devices. Finally, MIDI files are much more compact than even the most popular compressed audio formats such as MP3. Since the AtMega 1284p only has 16 kB of RAM, this is crucial to allow us to use on-chip RAM exclusively.

MIDI Time Scale

MIDI has two ways to express time intervals in the encoding: relative time and absolute time. Relative time is expressed in terms of a tempo, which has units of beats per minute (BPM). A status byte is encoded into the MIDI stream 24 times per quarter note. The benefit of relative time is that you can change the tempo of a MIDI song by changing only the tempo setting; the rest of the file need not be changed. Absolute time has no concept of tempo; rather, it issues a 2-byte time tick called a quarter frame message, which is issued 4 times per frame, or every 10 milliseconds in our prototype. Although tempo is not easily changed, recording user input is much easier, as the prototype has no way of knowing what tempo the user desires. Even if he or she inputs a tempo, it is unlikely that the user will be able to hold a constant tempo (it is very hard!). Thus, playback would inevitably differ from recorded input. For this reason, we chose to use absolute time.

Standards

International standard “IEC (EN DIN) 60825-1 SAFETY OF LASER PRODUCTS – Part 1” restricts laser products emitting coherent laser radiation. The TSAL6400 does not violate this standard according to Vishay’s documentation on eye safety with respect to IR LEDs. Our generated MIDI streams comply with MIDI standards.

Copyrights and Patents

Our project does not have any relevant copyrights, patents, or trademarks associated with it. Our software uses open-source protocols such as MIDI and borrows from open-source protocols such as TCP/IP. We could find no relevant patents for MIDI sequencers or synthesizers that resembled our software implementations, and the teaching algorithm that we wrote uses bounded linear search, which is not patented. All other software is borrowed from ECE 4760 labs; we are free to use it, but we cannot patent it. Our IR transceiver hardware is borrowed from ECE 4760 Lab 3, so we are also free to use it but unable to patent it. Our button keyboard is a simple array of active-low buttons, which is not patented and likely too simple to patent.

Hardware Design & Implementation

Hardware implements many of the core functions of our project. Specifically, we used a push-button keyboard that generates sounds when pressed, a low-pass filter on the PWM output to reconstruct the sounds we wanted, IR LEDs and receivers to wirelessly transmit MIDI files from board to board, and USART communication with a computer to select which mode to enter. See the “Appendices” section for complete circuit schematics.

Push-Button Keyboard

The push-button keyboard is the main interface for a user of our project. Users can play melodies, experiment with sounds, and add onto existing MIDI files. The keyboard is simply a set of push buttons soldered onto a board and connected to the eight pins of port C. The pull-up resistors of port C are enabled to ensure the pins are never floating. When a button is pushed, its pin is shorted to ground, and when the button is released, the pin is pulled up to V_cc. Thus, the buttons are active low. We connected each button to its own pin because we wanted to be able to play several notes at the same time, so we need one bit per button to ensure any possible combination of buttons can be decoded.
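A minimal sketch of decoding such an active-low port reading is shown below. The helpers are written as pure functions so they can be tried on a host machine; the function names are hypothetical, and the register usage mentioned in the comment (DDRC, PORTC, PINC) is how one would typically configure and read port C on an AVR, not a transcription of the project's code.

```c
#include <stdint.h>

/* On the MCU the reading would typically come from:
 *   DDRC  = 0x00;   // all port C pins as inputs
 *   PORTC = 0xFF;   // enable the internal pull-ups
 *   uint8_t reading = PINC;
 * A pressed button pulls its pin to ground, so it reads as 0. */

/* Invert the active-low reading: bit i set means button i is pressed. */
static uint8_t pressed_mask(uint8_t pin_reading)
{
    return (uint8_t)~pin_reading;
}

/* Number of buttons held down in a mask. */
static uint8_t count_pressed(uint8_t mask)
{
    uint8_t n = 0;
    while (mask) {
        n = (uint8_t)(n + (mask & 1u));
        mask >>= 1;
    }
    return n;
}

/* Buttons pressed now that were not pressed at the previous poll. */
static uint8_t newly_pressed(uint8_t prev_mask, uint8_t now_mask)
{
    return (uint8_t)(now_mask & (uint8_t)~prev_mask);
}
```

Because every button has its own bit, a reading of 0xFA decodes unambiguously to buttons 0 and 2 being held, whatever else is pressed; newly_pressed is the kind of edge detection that could trigger a note-on event.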

Low-pass Reconstruction Filter

To produce the sounds we wanted, we put the output of the PWM into a low-pass filter and passed the resulting wave into the speakers. We used the same resistor and capacitor values from Lab 2 (20kΩ resistor and 10nF capacitor), which also used DDS to create sound. These values gave us a cut-off frequency of 7.2 kHz, sufficient to filter out the 62.5kHz carrier frequency from the output of the PWM, while passing and reconstructing all frequencies we can produce. Although our filter is able to pass C8, the MIDI note with the highest frequency of 4186 Hz, the frequency of our Timer1 interrupt for direct digital synthesis limits our highest note to B7, which is 3951 Hz (see: Direct Digital Synthesis in Software).
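The corner frequency of a first-order RC low-pass filter is f_c = 1 / (2πRC). The small sketch below evaluates that expression; the function name is hypothetical. With 20 kΩ and 10 nF it gives roughly 0.8 kHz, whereas a corner near 7.2 kHz corresponds to about 2.2 kΩ with 10 nF, so the quoted component values and cutoff are worth re-checking against the schematic.

```c
/* Corner frequency in Hz of a first-order RC low-pass filter. */
static double rc_cutoff_hz(double r_ohms, double c_farads)
{
    const double pi = 3.14159265358979323846;
    return 1.0 / (2.0 * pi * r_ohms * c_farads);
}
```

Either corner sits well below the 62.5 kHz PWM carrier, so the carrier is attenuated in both cases; what changes is how much of the upper note range passes unattenuated.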

IR Transmission

We used the set-up of Lab 3 for our IR transmission, including the TSOP34156 receiver and TSAL6400 IR LEDs. We needed to send a few kB of data each time, so to ensure reliable transmission, we created a finite-state machine similar to TCP/IP. This is discussed further in the software section. We used a baud rate of 4,800 baud, and each character sent was seven bits with two stop bits. We picked a data payload of 64 bytes; larger payloads are more efficient, but a payload that is too large would decrease reliability. We decided that 64 bytes balances this trade-off well. We used a 100Ω resistor for the IR LED circuit such that the current through the IR LEDs was 25 mA (current = 2.5V / 100Ω). Circuit diagrams of the IR LED and receiver can be seen in Figures 5 and 6.
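The resend idea can be illustrated with a simple stop-and-wait sketch: send one 64-byte packet, wait for an acknowledgement, and retransmit if none arrives. Only the 64-byte payload size is taken from the text. The packet layout, the 8-bit checksum, the retry limit, the simulated lossy link, and all identifiers are assumptions made for illustration; the project's actual state machine may differ, and this sketch treats a lost acknowledgement the same as a lost packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_BYTES 64u  /* payload size from the text */
#define MAX_RETRIES   5u   /* assumed retry limit */

typedef struct {
    uint8_t seq;                     /* hypothetical sequence number */
    uint8_t len;                     /* bytes used in payload */
    uint8_t payload[PAYLOAD_BYTES];
    uint8_t checksum;                /* hypothetical 8-bit sum */
} packet_t;

static uint8_t packet_checksum(const packet_t *p)
{
    uint8_t sum = (uint8_t)(p->seq + p->len);
    for (uint8_t i = 0; i < p->len; i++)
        sum = (uint8_t)(sum + p->payload[i]);
    return sum;
}

/* Link callback: transmit one packet; return 1 if it was acknowledged. */
typedef int (*send_fn)(const packet_t *pkt, void *ctx);

/* Send a byte stream packet by packet, resending until acknowledged.
 * Returns the total number of transmissions, or -1 if a packet failed
 * after the first attempt plus MAX_RETRIES resends. */
static int send_stream(const uint8_t *data, size_t n, send_fn send, void *ctx)
{
    int transmissions = 0;
    uint8_t seq = 0;
    for (size_t off = 0; off < n; off += PAYLOAD_BYTES, seq++) {
        packet_t pkt;
        size_t remaining = n - off;
        pkt.seq = seq;
        pkt.len = (uint8_t)(remaining < PAYLOAD_BYTES ? remaining : PAYLOAD_BYTES);
        memcpy(pkt.payload, data + off, pkt.len);
        pkt.checksum = packet_checksum(&pkt);

        int acked = 0;
        for (unsigned tries = 0; !acked && tries <= MAX_RETRIES; tries++) {
            acked = send(&pkt, ctx);
            transmissions++;
        }
        if (!acked)
            return -1;
    }
    return transmissions;
}

/* Simulated link for host-side experiments: drops every Nth transmission. */
typedef struct {
    unsigned calls;       /* transmissions seen so far */
    unsigned drop_every;  /* 0 = never drop */
    uint8_t  received[512];
    size_t   received_len;
} sim_link_t;

static int sim_link_send(const packet_t *pkt, void *ctx)
{
    sim_link_t *link = ctx;
    link->calls++;
    if (link->drop_every && link->calls % link->drop_every == 0)
        return 0;                               /* packet lost: no ACK */
    if (packet_checksum(pkt) != pkt->checksum)
        return 0;                               /* corrupt: no ACK */
    assert(link->received_len + pkt->len <= sizeof link->received);
    memcpy(link->received + link->received_len, pkt->payload, pkt->len);
    link->received_len += pkt->len;
    return 1;
}
```

Sending a 150-byte stream over a link that drops every second transmission takes five transmissions for three packets, and the receiver still ends up with the bytes in order. A fuller protocol would also use the sequence number on the receiving side to discard duplicates when only the acknowledgement is lost.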

USART Transmission

We interfaced our PuTTY terminal and the MCU using the MCU’s USART0. One of our prototype boards had a presoldered USB input for USART communication; we added a CP2102 Breakout to the other prototype. We chose to use USB and USART communication between the MCU and PuTTY because it is inexpensive and reliable. We interfaced our IR transceiver with the MCU using the MCU’s USART1. We needed to change our serial frame format for the MCU’s USART1 based on the limitations of our IR receiver hardware. According to the TSOP34156 datasheet, “after each burst which is between 6 cycles and 70 cycles a gap time of at least 10 cycles is necessary.” Since the TSOP34156 transmits data directly to our USART, this means that our USART serial frames must maintain this ratio. To do this, we chose to have 7 data bits and 2 stop bits per serial frame, putting our serial transmission well within this hardware limitation.

Hardware Limitations

One challenge we faced was the restrictions that hardware imposed on us. We initially wanted to use the RC5 protocol for our IR communication, but after doing research and looking at sample code online, we realized that sending RC5 messages was extremely complicated and depended on very accurate timing. In this case, both the hardware and software limited our ability to implement this method of communication.

Another example is the button keypad we used; originally we wanted to use the 16-button telephone keypad from Lab 2 to interface with the user and create tones when pressed. However, we soon realized that it was impossible to distinguish two presses from three (or more) when the pressed buttons shared rows and columns; this is due to the matrix nature of the keypad, since the third button press would connect wires that the first two buttons had already connected. Since we wanted to be able to play at least three buttons at a time, this keypad was unusable for our design; as a result, we created the keypad we used in the end. A similar limitation was that our keypad has just eight buttons instead of the 16 we wanted; the reason for this was the number of pins on the MCU. We needed to connect each button to a different pin, but we did not have enough unused pins to implement this.
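The keypad ambiguity can be reproduced with a small simulation of an ideal, diode-less 4×4 matrix: while one row is driven, a column reads active if any chain of closed switches connects it to that row. The model and the function name are assumptions for illustration and not a description of the Lab 2 keypad's exact wiring.

```c
#include <stdint.h>

#define ROWS 4
#define COLS 4

/* For each driven row, mark every column reachable from it through
 * closed switches. seen[r][c] is what a scan would report for key (r, c). */
static void scan_matrix(uint8_t pressed[ROWS][COLS], uint8_t seen[ROWS][COLS])
{
    for (int r = 0; r < ROWS; r++) {
        uint8_t row_live[ROWS] = {0};
        uint8_t col_live[COLS] = {0};
        int changed = 1;

        row_live[r] = 1;  /* drive only this row */
        while (changed) { /* spread the signal until nothing new is reached */
            changed = 0;
            for (int i = 0; i < ROWS; i++) {
                for (int j = 0; j < COLS; j++) {
                    if (!pressed[i][j])
                        continue;
                    if (row_live[i] && !col_live[j]) { col_live[j] = 1; changed = 1; }
                    if (col_live[j] && !row_live[i]) { row_live[i] = 1; changed = 1; }
                }
            }
        }
        for (int j = 0; j < COLS; j++)
            seen[r][j] = col_live[j];
    }
}
```

Pressing keys (0,0), (0,1), and (1,0) makes the scan also report (1,1), so three presses at the corners of a rectangle are indistinguishable from four. Two keys that share no row or column scan back exactly as pressed, which is why the problem only appears with certain chords.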
Lastly, we wanted to have the ability to play longer songs, so we planned on using DataFlash for more memory. However, after looking at sample code, we realized that this was not an easy task and that the benefit of longer songs was not worth the time to implement it. Additionally, we mixed up the order for the DataFlash part, so it did not arrive until we were done with the rest of our project. As a result, we did not implement DataFlash to hold longer recorded songs.

Parts List:

Category	Part	Source	Unit Cost	Quantity	Total Cost
Electronics	White Board	ECE 4760 Lab	$6.00	2	$12.00
	AtMega1284p Target Board	ECE 4760 Lab	$5.00	2	$10.00
	Logitech S-120 PC Multimedia Speakers	ECE 4760 Lab	$10.00	2	$20.00
	Power Supply	ECE 4760 Lab	$5.00	2	$10.00
	Solder Board (6 inch)	ECE 4760 Lab	$2.50	2	$5.00
	12mm Round Pushbutton	ECE 4760 Lab	$0.35	16	$5.60
	Alligator Clips	ECE 4760 Lab	$0.50	2	$1.00
	USB Cable, Type A Male, Type B Male	ECE 4760 Lab	$4.00	2	$8.00
	TSOP34156 IR Receiver	ECE 4760 Lab	$1.20	2	$2.40
	TSAL6400 IR LED	ECE 4760 Lab	$0.55	4	$2.20
	20 kΩ Resistor	ECE 4760 Lab	$0.00	2	$0.00
	100 Ω Resistor	ECE 4760 Lab	$0.00	2	$0.00
	500 kΩ Resistor	ECE 4760 Lab	$0.00	2	$0.00
	330 kΩ Resistor	ECE 4760 Lab	$0.00	16	$0.00
	10 nF Capacitor	ECE 4760 Lab	$0.00	2	$0.00
	100 μF Electrolytic Capacitor	ECE 4760 Lab	$0.00	2	$0.00
				Total	$76.20
