Introduction
Current solutions for this kind of "announcing system" are limited by OTP (One-Time Programmable) voice chips, which have small capacity (they normally use EPROM as the storage medium) and are relatively expensive: the price of an OTP voice IC is determined by its capacity, that is, its voice recording time, normally 10 s to 200 s. For instance, the OTP chip AP89010 (manufactured by APlus) costs $0.48 (10 seconds), and the AP89341 costs $2.43 (150 seconds). One announcing system may use multiple voice chips (2 to 10 pieces), which also complicates the hardware design. By contrast, standard SD cards keep getting cheaper with the rapid development of storage technology: a 512MB SD card (Kingston) costs only $3.65. Compared to an OTP chip, 512MB of storage can hold as many as 128 songs in MP3 format or 10 lossless songs in WAV format. The key point is that an SD card can be easily formatted, and songs or recorded files can be modified at will, as long as you have a PC that supports the FAT file system.
As mentioned above, an announcing system built on OTP chips has a complicated circuit: different audio announcements must be played from different chips, and the control logic is quite a burden, which increases circuit complexity and PCB area. By contrast, our system needs only 6 main wires (SPI bus: 4 wires, PWM output channels: 2 wires) to play songs or any recorded wave files. An AVR chip can be purchased for as little as $4 (ATmega328), so the whole system can be very cost-effective.
High Level Design
Project Idea
There are many MCU-based MP3 players. However, this kind of MP3 player needs not only an extra hardware decoder but also a DAC chip, and the circuit is complicated. How can we design a cheap music player while preserving audio quality? To replace the MP3 decoder, we can simply use WAVE files, whose format is simpler than MP3 and can be easily decoded by an MCU. To replace the DAC chip, PWM with a simple R-C filter can be used, as in the Cricket Call Generator. From this idea, the AVR-based WAVE player was born.
Background Knowledge
Since an SD card is used as the storage medium, the only way to communicate with it other than SDIO is SPI. In general, SPI has four working modes.
To operate an SD card over SPI, the working mode must be specified. Section 7.2 of "SD Specifications Part 1 Physical Layer Simplified Specification" covers part of the SPI protocol but does not specify the details. After several experiments, the suitable SPI working modes for SD card communication turned out to be MODE0 and MODE1.
Meanwhile, instead of reading RAW data from the SD card, a FAT16/32 file system is implemented, so that WAVE files can be easily stored on the SD card from any PC with an SD card reader. The FAT file system is the open-source "Petit FatFs", a simpler version of the popular "FatFs" developed by ChaN.
With the knowledge above, I still needed to get familiar with the WAVE file format, which is actually a subset of the "RIFF" format developed by Microsoft. The RIFF audio format is introduced in detail in the software section.
Logical Structure
At a high level, the logical structure is much simpler. It contains a "Storage Module" and a "PWM-DAC Module". With the embedded hardware SPI interface and the Petit FatFs module, WAVE files stored on the SD card can be retrieved. The ATmega1284 then parses the WAVE file and extracts the pure audio data, which drives the PWM outputs (4 channels). The four 8-bit PWMs are divided into two groups (corresponding to the left and right audio channels), and each group is combined into one 16-bit "PWM-DAC" output. These two output channels are finally connected to the loudspeaker.
User Experience (UE), Hardware (HW) and Software (SW) Trade-offs
There are always trade-offs among UE, HW and SW. This project shows how to handle these trade-offs appropriately.
UE and SW
There are two methods to retrieve audio data from the SD card. The first is to write RAW audio data to the SD card with one of the many third-party software tools. With this method the software design is pretty easy: all the MCU needs to do is read audio data via the SPI interface and forward the audio data stream to the PWM channels. However, replacing the audio data on the SD card becomes very annoying.
The other method is to use a standard file system so that the SD card can be used normally in any PC with FAT support. With this method it is very convenient to replace or update audio files. The difficulty is that a lightweight file system must be implemented so that the MCU can retrieve data via standard FAT16/32.
HW and SW
MP3 is a popular audio format with small file size and good audio quality. If we used MP3 files as the audio source, a hardware decoder would have to be added: the ATmega1284, like any other low-cost 8-bit MCU, cannot guarantee playback quality while decoding the complicated, compressed MP3 format. By contrast, the WAVE file format is simpler: it consists of a simple file header followed by RAW audio data that can be output directly to the PWM channels (16-bit audio data needs only a simple subtraction).
Another advantage of WAVE files is that they are lossless. As long as we can guarantee playback quality (outputting audio data at exactly the sample rate), we can in theory achieve very high audio quality. WAVE files are much bigger than MP3 files, but that is not a problem: SD cards are so cheap these days.
Standards
- SD Specifications Part 1 Physical Layer Simplified Specification (version 4.10);
- Microsoft Extensible Firmware Initiative FAT32 File System Specification (version 1.03);
- Microsoft New Multimedia Data Types and Data Techniques (version 3.0);
- ANSI-C Specification (ISO / IEC 9899).
- GCC Manual (version 4.9.2)
Relevant Copyrights
Petit FatFs is a lightweight FAT file system that supports FAT12/16/32, developed by ChaN. It is specially designed for MCUs with little RAM.
Hardware
Hardware Overview
As mentioned before, the hardware is pretty simple. The whole schematic is shown below.
Hardware Design: SD Card SPI Interface
As mentioned in the schematic note, the SD card only accepts a 3.3V operating voltage. Even though the ATmega1284p can also work at 3.3V, it may not be able to run at 16MHz at that voltage ("Atmel-8272-8-bit-AVR-microcontroller-ATmega164A_PA-324A_PA-644A_PA-1284_P_datasheet", p. 336, Figure 28-1). To ensure system stability, the ATmega1284p is powered by 5V while the SD card is powered by 3.3V from a cheap linear regulator, the AMS1117-3V3 (not shown in the schematic).
To interact with the SD card correctly, resistor dividers (3.3K and 1.8K) are used to guarantee that the AVR's SPI signal levels never exceed 3.3V. Note that the MISO signal does not need a resistor divider, since a 3.3V level is accepted by the AVR.
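As a quick check of the divider values, the sketch below computes the unloaded output of a two-resistor divider. It assumes the 1.8K resistor sits in series with the AVR pin and the 3.3K resistor goes to ground; the text does not state the orientation, and the function name is only illustrative.

```c
/* Unloaded output of a resistor divider: Vin -- r_top --+-- r_bottom -- GND.
 * Assumption (not stated in the text): r_top = 1.8K, r_bottom = 3.3K. */
static double divider_out_volts(double vin, double r_top, double r_bottom)
{
    return vin * r_bottom / (r_top + r_bottom);
}
```

Under that assumption, a 5V input gives roughly 3.24V, just under the 3.3V limit of the SD card.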
Hardware Design: PWM Combination
A single PWM output can use one output pin to generate analog signals with 8-bit resolution (as we learned in LAB2). However, we can simply create another analog signal via a second PWM output channel, and then make this second analog signal represent the lower-order bits by selecting the correct summing ratio.
In this project, the PWM combination circuit is composed of two resistors: 1M and 3.9K. The summing ratio is about 256:1 (1M / 3.9K), which corresponds to 8 bits. The larger resistor is connected to the LSB PWM output channel while the smaller resistor is connected to the MSB PWM channel. When the MSB PWM outputs 0V and the LSB PWM outputs 5V, the output is 5V x 3.9K / (1M + 3.9K), so the minimum resolution is about 0.01942V. Assuming a resistor tolerance of 1%, the actual output lies in the range 0.0191V < V < 0.0199V, a deviation range of only 800uV. If the resistor tolerance can be tightened to 0.3%, the circuit can achieve full 16-bit resolution.
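The resistor arithmetic above can be reproduced with a small helper. This is a minimal sketch that treats each PWM channel as an ideal averaged voltage source and ignores loading and the output filter; the macro and function names are illustrative, not taken from the project code.

```c
#define R_MSB_OHMS 3900.0      /* smaller resistor, MSB PWM channel */
#define R_LSB_OHMS 1000000.0   /* larger resistor, LSB PWM channel */

/* Unloaded summing-node voltage for two (already averaged) PWM voltages. */
static double pwm_sum_volts(double v_msb, double v_lsb)
{
    return (v_msb * R_LSB_OHMS + v_lsb * R_MSB_OHMS)
         / (R_MSB_OHMS + R_LSB_OHMS);
}

/* Weight of the MSB channel relative to the LSB channel. */
static double pwm_sum_ratio(void)
{
    return R_LSB_OHMS / R_MSB_OHMS;
}
```

With the MSB channel at 0V and the LSB channel at 5V, `pwm_sum_volts` returns about 0.01942V, and `pwm_sum_ratio` returns about 256.4.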
Software
Software Overview
The whole software design is implemented using AVR Studio 6.2.1502 (Service Pack 1). The project architecture is cleanly layered: the "app" (Application) layer, the "bsp" (Board Support Package) layer, the "drv" (Driver) layer and the "PetitFatFs" (File System) layer.
Software Design: Top Level
At the top level, the whole system can be divided into 7 states, and a simple state machine guarantees the system functionality (see figure below). All 7 states, plus one state for debugging purposes, are clearly organized, but the low-layer Petit FatFs interface still has to be changed to cooperate with this state machine. Specifically, we must change the low-layer data retrieval function in order to implement special data processing.
Software Design: Data Retrieve and Consume
The low-layer data read function is named "disk_readp". The file system layer calls this function to initialize FAT, read directories, read specified files, and so on. The function checks the type of the SD card to see whether the address must be converted (from a logical block address to a byte address). Then the CMD17 (READ SINGLE BLOCK) command is sent to the SD card. Once a valid response (0xFE) is received, a data stream is created to forward the specified data into the assigned buffer. Finally, the function skips the "un-aligned" remainder of the data stream (the required data volume may not be an integer multiple of 512 bytes). All of the supported SD card SPI commands are listed in Appendix C.
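To make the address conversion and the CMD17 step concrete, the sketch below builds the 6-byte SPI command frame. It assumes that standard-capacity cards expect a byte address while block-addressed (high-capacity) cards take the sector number directly, and that the CRC byte can be a dummy value because CRC checking is normally off in SPI mode. The function name and the `block_addressed` flag are hypothetical and not part of Petit FatFs.

```c
#include <stdint.h>

#define SD_CMD17        17u    /* READ_SINGLE_BLOCK */
#define SD_SECTOR_SIZE  512u

/* Fill a 6-byte SPI command frame for CMD17.
 * block_addressed != 0: the card takes the sector number as-is.
 * block_addressed == 0: the card expects a byte address (sector * 512). */
static void sd_build_cmd17(uint8_t frame[6], uint32_t sector, int block_addressed)
{
    uint32_t arg = block_addressed ? sector : sector * SD_SECTOR_SIZE;

    frame[0] = (uint8_t)(0x40u | SD_CMD17);  /* start bit, transmission bit, index */
    frame[1] = (uint8_t)(arg >> 24);         /* argument, most significant byte first */
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = 0x01u;                        /* dummy CRC7 plus end bit */
}
```

For sector 2 on a byte-addressed card, the frame is 0x51 0x00 0x00 0x04 0x00 0x01.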
Since "disk_readp" takes a pointer to the target buffer, we can use this pointer to embed our own function. If the pointer is valid, the function executes the requested read operation. If it is invalid, say equal to 0 (a NULL pointer), it executes our specified function instead. In this way we can steer the data flow through our own functions without disturbing the normal operation of Petit FatFs. With this modification, the core functionality becomes buffer data flow control.
Since all the necessary information can be obtained by parsing the header of the wave file, we can set hardware parameters based on the parsing result, such as the configuration of the "Sampling TIMER" that forwards audio data to the PWM channels. The file system layer then retrieves data from the SD card into the specified audio buffer while the "Sampling TIMER" consumes data from that buffer. A ring buffer is implemented to balance the retrieve / consume relationship. The diagram of this idea is shown below.
As shown in the diagram, the ISR is designed to be short and simple in order to guarantee an accurate interrupt interval.
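The NULL-pointer dispatch described above can be sketched in portable C as follows. This is not the project's modified "disk_readp"; it only illustrates the idea on bytes that have already been received. `deliver_bytes`, `stream_sink` and the demo sink are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*byte_sink_fn)(uint8_t);

static byte_sink_fn stream_sink;   /* hypothetical hook, set before streaming */
static uint16_t     demo_sum;      /* used only by the demo sink below */

/* Demo sink: adds up every byte it receives. */
static void demo_sum_sink(uint8_t b)
{
    demo_sum = (uint16_t)(demo_sum + b);
}

/* Deliver 'count' received bytes either to a buffer or, if buff is NULL,
 * to the registered sink function. */
static void deliver_bytes(uint8_t *buff, const uint8_t *rx, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (buff != NULL) {
            buff[i] = rx[i];           /* normal read into the caller's buffer */
        } else if (stream_sink != NULL) {
            stream_sink(rx[i]);        /* streaming mode: hand the byte to the sink */
        }
    }
}
```

Because the buffer pointer doubles as the mode switch, callers that pass a real buffer see unchanged behavior, and only a NULL pointer routes data to the sink.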
Software Design: Wave File Parsing
As mentioned above, the WAVE file format is pretty easy to parse. As a subset of RIFF, it has exactly the structure defined in "Microsoft New Multimedia Data Types and Data Techniques (version 3.0)". According to this specification, the RIFF WAVE file format is as shown below.
Based on this format, we can design appropriate software to parse the wave file and get the essential information we actually need. Note that the "Chunk ID" and "Chunk Size" fields always total 8 bytes. Considering that the upper-layer function "pf_read" internally increments the data pointer, we can fetch 8 bytes at a time and decide how to proceed based on the "Chunk ID" (this kind of identifier is called FCC, a Four-Character Code). Note that the Chunk ID characters are stored in "Big Endian" order (first character first), so we can either convert the original data or convert the normalized ID for comparison. For instance, we can load the four bytes directly as one word, which is efficient, and compare this word with the byte-reversed Chunk ID.
This parsing is implemented in the function "App_Wave_ParseHeader". The software diagram of this function is shown below. Please note that checks for the "LIST" (play list) chunk, the "DISP" (display) chunk and the "fact" chunk are also added to guarantee compatibility; we simply skip these chunks when they appear. Meanwhile, for both debugging and robustness, different return values are defined. Based on the audio information read from the file, we can check whether the file is corrupted and return the corresponding error information.
Now that we know the format of the file header, we still need to know how the audio data is arranged in different WAVE files. There are four kinds of WAVE files: 8-bit Mono, 8-bit Stereo, 16-bit Mono and 16-bit Stereo. Only with this understanding can we design an efficient buffer structure and data retrieval functions. Simple code in the ISR means extra work must be done outside the ISR. Based on the RIFF format of WAVE files, the data arrangement is shown below.
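The chunk-walking approach described above can be sketched on an in-memory header as follows. This is not "App_Wave_ParseHeader" itself: it compares chunk IDs with `memcmp` rather than the reversed-word comparison mentioned in the text, reads from a buffer instead of calling "pf_read", and uses illustrative names and error codes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_offset;   /* offset of the first audio byte */
    uint32_t data_size;     /* audio payload size in bytes */
} wave_info_t;

static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, a negative code on error. */
static int wave_parse_header(const uint8_t *buf, size_t len, wave_info_t *out)
{
    size_t pos = 12;
    int have_fmt = 0;

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;                          /* not a RIFF/WAVE file */

    while (len - pos >= 8) {
        const uint8_t *id = buf + pos;      /* 4-byte chunk ID */
        uint32_t size = rd_le32(buf + pos + 4);
        pos += 8;

        if (memcmp(id, "fmt ", 4) == 0) {
            if (size < 16 || len - pos < 16)
                return -2;                  /* truncated format chunk */
            if (rd_le16(buf + pos) != 1)
                return -3;                  /* only plain PCM is handled */
            out->channels        = rd_le16(buf + pos + 2);
            out->sample_rate     = rd_le32(buf + pos + 4);
            out->bits_per_sample = rd_le16(buf + pos + 14);
            have_fmt = 1;
        } else if (memcmp(id, "data", 4) == 0) {
            if (!have_fmt)
                return -4;                  /* data chunk before format chunk */
            out->data_offset = (uint32_t)pos;
            out->data_size   = size;
            return 0;
        }
        /* LIST, DISP, fact and unknown chunks fall through and are skipped. */
        size += size & 1u;                  /* chunk bodies are padded to even length */
        if (size > len - pos)
            return -5;                      /* chunk runs past the buffer */
        pos += size;
    }
    return -6;                              /* no data chunk found */
}
```

Distinct negative return codes mirror the idea of separate error values for debugging; the specific values here are arbitrary.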
Software Design: Audio Buffer
Based on the audio data layout of the different wave files, four independent buffers are created to match the 16-bit Stereo wave format, since it needs 4 bytes per sample. All buffers have the same size, 256 bytes, and are equipped with the same 8-bit "Head" and "Tail" pointers. With 8-bit pointers, ring buffer operations become simpler: the head or tail pointer is simply incremented without checking the buffer size, since it wraps to zero once its value exceeds 255. Another reason is that 8-bit addition is faster than 16-bit addition. This is why four independent buffers are better than one really large buffer.
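The wrap-around behavior of 8-bit indices can be illustrated with a single 256-byte ring buffer. This is a minimal sketch with hypothetical names; it keeps one slot empty to tell "full" from "empty", which is a common convention and not necessarily what the project does.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t data[256];
    volatile uint8_t head;   /* next write position, advanced by the producer */
    volatile uint8_t tail;   /* next read position, advanced by the consumer */
} ring8_t;

/* Store one byte; returns false if the buffer is full. */
static bool ring8_put(ring8_t *rb, uint8_t value)
{
    uint8_t next = (uint8_t)(rb->head + 1);   /* wraps from 255 to 0 by itself */

    if (next == rb->tail)
        return false;                         /* full: one slot is kept empty */
    rb->data[rb->head] = value;
    rb->head = next;
    return true;
}

/* Fetch one byte; returns false if the buffer is empty. */
static bool ring8_get(ring8_t *rb, uint8_t *value)
{
    if (rb->tail == rb->head)
        return false;                         /* empty */
    *value = rb->data[rb->tail];
    rb->tail = (uint8_t)(rb->tail + 1);       /* wraps from 255 to 0 by itself */
    return true;
}
```

No modulo or comparison against the buffer size is needed, because a `uint8_t` index cannot leave the range 0 to 255.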
Software Design: Preparation Before Playing
Before playing a specified wave file, we need to configure the sample TIMER correctly and assign the appropriate variables to make data processing efficient. Note that in the "Software Design: Data Retrieve and Consume" section, we use a function pointer to call a specified function that forwards the data stream into the corresponding buffers. Based on the data arrangement of the different wave files, all we need to do is make sure that each byte is put into the buffer location that matches the data arrangement. For mono wave files, we simply put the same value into the buffers for both the left and right channels. The function pointer prototype is "typedef void (*memProcFunc)(uint8_t)", and a variable of type "memProcFunc" is defined. This function pointer is initialized once the audio information has been correctly parsed from the header.
Four independent functions are implemented, one for each data buffer layout. The audio buffer arrangement of these four functions is shown below.
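One possible shape for the four per-format functions is sketched below. Only the `memProcFunc` prototype comes from the text; the buffer names, the shared `head` index, the `phase` counter and `select_proc` are assumptions made for illustration. The sketch also assumes that 8-bit samples are already unsigned and that 16-bit samples are signed little-endian, so flipping the top bit of the high byte (equivalent to subtracting 0x8000 modulo 65536) converts them to the unsigned range a PWM needs.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*memProcFunc)(uint8_t);   /* prototype quoted in the text */

/* One 256-byte buffer per output byte: left/right channel, high/low byte. */
static uint8_t buf_l_hi[256], buf_l_lo[256], buf_r_hi[256], buf_r_lo[256];
static uint8_t head;    /* shared write index, wraps from 255 to 0 */
static uint8_t phase;   /* byte position inside the current sample frame */

static void proc_mono8(uint8_t b)
{
    buf_l_hi[head] = b; buf_r_hi[head] = b;   /* same value on both channels */
    buf_l_lo[head] = 0; buf_r_lo[head] = 0;
    head++;
}

static void proc_stereo8(uint8_t b)
{
    if (phase == 0) { buf_l_hi[head] = b; buf_l_lo[head] = 0; phase = 1; }
    else            { buf_r_hi[head] = b; buf_r_lo[head] = 0; phase = 0; head++; }
}

static void proc_mono16(uint8_t b)
{
    if (phase == 0) {
        buf_l_lo[head] = b; buf_r_lo[head] = b; phase = 1;
    } else {
        b = (uint8_t)(b ^ 0x80u);             /* signed -> offset binary */
        buf_l_hi[head] = b; buf_r_hi[head] = b; phase = 0; head++;
    }
}

static void proc_stereo16(uint8_t b)
{
    switch (phase) {
    case 0:  buf_l_lo[head] = b;                    phase = 1; break;
    case 1:  buf_l_hi[head] = (uint8_t)(b ^ 0x80u); phase = 2; break;
    case 2:  buf_r_lo[head] = b;                    phase = 3; break;
    default: buf_r_hi[head] = (uint8_t)(b ^ 0x80u); phase = 0; head++; break;
    }
}

/* Pick the handler once, after the header has been parsed. */
static memProcFunc select_proc(uint16_t channels, uint16_t bits)
{
    phase = 0;
    if (bits == 8)  return (channels == 1) ? proc_mono8  : proc_stereo8;
    if (bits == 16) return (channels == 1) ? proc_mono16 : proc_stereo16;
    return NULL;    /* unsupported format */
}
```

Selecting the handler once keeps the per-byte path free of format checks, which matches the goal of doing the extra work outside the ISR.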
Software Design: Play Wave File and Key Control
The Wave Play function is pretty simple. It first processes the audio data that follows the header, since the size of this data is less than 512 bytes. After that, the function processes the remaining audio data in units of 1024 bytes until less than 1024 bytes remain.
The key scan subroutine is embedded into the Wave Play function. To remove key glitches, a scan subroutine must be implemented. Since we read 1024 bytes at a time (except for the data following the header), we can roughly calculate the time interval between read operations from the speed of the SPI interface, and take advantage of this processing interval to execute the key scan routine.
The software flow charts of "App_Wave_Play" and "App_Ctrl_KeyScan" are shown below.
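One reading of this read pattern is that the first read finishes the 512-byte sector that contains the header, so that every later read starts on a sector boundary. The sketch below plans read sizes under that assumption; the function name and constants are illustrative and are not taken from "App_Wave_Play".

```c
#include <stdint.h>

#define SECTOR_SIZE 512u
#define BURST_SIZE  1024u

/* Size of the next read, given the current file position and the number of
 * audio bytes still to be played. */
static uint32_t next_read_size(uint32_t file_pos, uint32_t remaining)
{
    uint32_t to_boundary = SECTOR_SIZE - (file_pos % SECTOR_SIZE);
    /* Unaligned: read up to the sector boundary. Aligned: read a full burst. */
    uint32_t want = (to_boundary != SECTOR_SIZE) ? to_boundary : BURST_SIZE;

    return (want < remaining) ? want : remaining;
}
```

For a 44-byte header the first read would be 468 bytes, followed by 1024-byte reads and one shorter final read.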
Testing and Results
Hardware Test
The key point of the hardware test is the accuracy of the 16-bit combination built from high-precision resistors. Its behavior is pretty difficult to measure: a high-precision multimeter is required for rigorous validation. To verify the logic, a test firmware was created to evaluate the hardware. This firmware generates PWM in Fast PWM mode and increments the PWM value every 200ms. With Proteus (version 8.1 sp1) and the schematic described in "Hardware Design: PWM Combination", the oscilloscope output is shown below. Note that the horizontal scale is adjusted until the "climbing steps" can be clearly observed.
From the waveform we can see that the 16-bit PWM output has nearly invisible "steps" compared to the 8-bit PWM output. However, software simulation is still not a good way to evaluate the circuit, and this test needs improvement.
Software Test
Testing the audio quality is also a big problem, since it is very subjective. What's more, the human ear is not as accurate as a machine. There are two ways to evaluate audio quality, even though neither is very professional. One way is to record the audio output (via the LINE-IN audio port) and save it in the same wave format as the original file, then use "Adobe Audition" to compare the two audio files. The other way is to measure the play time and compare it with that of the original wave file.
Software Test: Sample Rate
For each wave file format, a different song with the same sample rate (44100Hz) is chosen for this simple test. A simple VC program captures the song start marker (a simple "play!" string) output from the hardware USART interface. Once the string is captured, a timer is started to record the elapsed time until the end marker (the "over!" string) is captured. The result is shown below.
The 16-bit TIMER1 is used as the sampling timer, and the value for the output compare register is 16000000/44100 + 0.5 - 1 = 362 (truncated to an integer). Adding "0.5" rounds to the nearest integer, which is important for precision. Even so, the resulting sample frequency is 16000000/363 = 44077.135 Hz rather than 44100 Hz. That is why the deviation exists.
From the result we can also see that as the bytes per sample increase, the deviation becomes smaller. This is because the audio data is consumed faster with the 16-bit stereo format. We can imagine that if the sample rate of the wave file went a bit higher, the current system might not be able to play it normally because of the SPI speed limit (8MHz), and CPU overclocking may be necessary if we want higher audio quality.
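The compare value and the resulting sample rate can be checked with integer arithmetic. This sketch assumes the timer counts CPU clock ticks with no prescaler and restarts after (compare value + 1) ticks; the function names are illustrative.

```c
#include <stdint.h>

/* Compare value for a timer that restarts after (ocr + 1) CPU clock ticks.
 * Adding half the divisor before dividing rounds to the nearest integer. */
static uint16_t sample_timer_compare(uint32_t f_cpu_hz, uint32_t sample_rate_hz)
{
    uint32_t ticks = (f_cpu_hz + sample_rate_hz / 2u) / sample_rate_hz;

    return (uint16_t)(ticks - 1u);
}

/* Sample rate actually produced by a given compare value. */
static double actual_sample_rate(uint32_t f_cpu_hz, uint16_t ocr)
{
    return (double)f_cpu_hz / ((double)ocr + 1.0);
}
```

At 16MHz and 44100Hz this yields a compare value of 362 and an actual rate of about 44077.135Hz. A clock that is an exact multiple of 44100Hz, such as 11.2896MHz (256 x 44100), would give a compare value of 255 with no rounding error.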
Software Test: Audio Quality Analysis
It is difficult to compare two audio files intuitively, even with Adobe Audition. However, to test mono wave files we can merge the recorded wave file with the original one. For instance, in Adobe Audition we can put the original wave file in the left channel and copy the recorded wave file to the right channel, and Adobe Audition can then analyze the result.
For 8-bit Stereo wave files, we need to extract the audio data from each channel and repeat the procedure above. For 16-bit Stereo wave files, on the other hand, we can simply use the high-order byte as the data source for analysis. This is safe because Adobe Audition only accumulates the audio samples at different frequency points, independent of amplitude. In this project, I focused on 16-bit Stereo analysis and was not able to analyze all four kinds of wave files due to time limitations. However, this test method can support cross-analysis to get more information.
The FFT diagram generated by Adobe Audition is shown below. All the data of FFT analysis can be exported for further analysis.
The analysis result is then imported into Excel to calculate the deviation. Part of the result is shown below. The FFT waveform still shows some noise, which causes the deviation from the original data; a good low-pass filter should improve the result. Even so, the average deviation is 0.46%, which indicates that the audio quality is still good.
Safety
This design does not currently have electrical components in direct skin contact, nor do we expect current draw beyond a few microamps. Nevertheless, a future implementation will add an ESD protection circuit to avoid circuit damage caused by electrostatic discharge from the human body.
Usability
This system is pretty easy to operate with two simple keys. A future implementation will embed a display module into the system to improve the user experience.
Conclusions
Accomplishments and Further Extensions
The core functionality was designed and delivered as planned. The system is reliable and able to play standard wave files at CD audio quality (sample rate <= 44100Hz). Even though it has no hardware filter, the audio quality is good even with an ordinary headset.
As mentioned in the sections above, the following ideas are worth carrying out as further extensions.
- UE (User Experience) Enhancements
A display module together with a simple low-pass filter will be implemented to enhance operability and user experience. Considering the size of the circuit, a small OLED module may be used to display essential play information.
- Audio Quality Enhancements
To increase audio quality, the oscillator may be replaced with a standard 11.2896MHz one to guarantee the accuracy of the sample rate. A well-designed low-pass filter can also help to improve audio quality.
- Test Improvements
As mentioned in the Test section, cross-analysis of different wave files is very useful for further research. Meanwhile, measurements with a high-precision digital multimeter (FLUKE 8846A) could also strongly support the practicality of the PWM combination circuit as a substitute for a 16-bit DAC.
Intellectual Property Considerations
Except for the open-source Petit FatFs developed by ChaN, the rest of the code is my own.
Ethical Considerations
There are no known ethical considerations regarding the design of this project.
Legal Considerations
There are no known legal considerations regarding the design of this project, since it is not implemented for commercial purposes. If this project were re-purposed for commercial use, it should still be fine, since the author of Petit FatFs authorizes free usage (see Appendix B).
Appendices
Appendix A: Parts List and Costs
Part | Vendor | Cost/Unit | Quantity | Total Cost |
---|---|---|---|---|
Atmega 1284 | Lab Stock | $4.67 | 1 | $4.67 |
Button | Lab Stock | $0.00 | 2 | $0.00 |
High-Precision Resistors | Lab Stock & AMAZON | $0.34 | 4 | $1.36 |
SD Card (1GB) | Kingston | $4.95 | 1 | $4.95 |
USB-USART module | Personal Stock | $6.65 | 1 | $6.65 |
LT1117-3V3 | Personal Stock | $4.52 | 1 | $4.52 |
TOTAL: | | | | $22.15 |
Appendix B: Petit-FatFs Author Claims
/*---------------------------------------------------------------------------/
/ Petit FatFs - FAT file system module R0.03 (C)ChaN, 2014
/----------------------------------------------------------------------------/
/ Petit FatFs module is a generic FAT file system module for small embedded
/ systems. This is a free software that opened for education, research and
/ commercial developments under license policy of following trems.
/
/ Copyright (C) 2014, ChaN, all right reserved.
/
/ * The Petit FatFs module is a free software and there is NO WARRANTY.
/ * No restriction on use. You can use, modify and redistribute it for
/ personal, non-profit or commercial products UNDER YOUR RESPONSIBILITY.
/ * Redistributions of source code must retain the above copyright notice.
/
/----------------------------------------------------------------------------/
Appendix C: SD Command List (SPI Mode)
For more detail: AVR 16bit Stereo Wave Player