Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis

Outline
- Introduction to a simple digital camera
- Designer's perspective
- Requirements specification
- Design
  - Four implementations

Introduction
- Putting it all together
  - General-purpose processor
  - Single-purpose processor
    - Custom
    - Standard
  - Memory
  - Interfacing
- Knowledge applied to designing a simple digital camera
  - General-purpose vs. single-purpose processors
  - Partitioning of functionality among different processor types

Introduction to a simple digital camera
- Captures images
- Stores images in digital format
  - No film
  - Multiple images stored in camera
    - Number depends on amount of memory and bits used per image
- Downloads images to PC
- Only recently possible
  - Systems-on-a-chip
    - Multiple processors and memories on one IC
  - High-capacity flash memory
- Very simple description used for example
  - Many more features with real digital camera
    - Variable size images, image deletion, digital stretching, zooming in and out, etc.

Designer's perspective
- Two key tasks
  - Processing images and storing in memory
    - When shutter pressed:
      - Image captured
      - Converted to digital form by charge-coupled device (CCD)
      - Compressed and archived in internal memory
  - Uploading images to PC
    - Digital camera attached to PC
    - Special software commands camera to transmit archived images serially

Charge-coupled device (CCD)
- Special sensor that captures an image
- Light-sensitive silicon solid-state device composed of many cells

When exposed to light, each cell becomes electrically charged. This charge can then be converted to an 8-bit value, where 0 represents no exposure and 255 represents very intense exposure of that cell to light.

Some of the columns are covered with a black strip of paint. The light intensity of these pixels is used for zero-bias adjustments of all the cells.

The electromechanical shutter is activated to expose the cells to light for a brief moment.

The electronic circuitry, when commanded, discharges the cells, activates the electromechanical shutter, and then reads the 8-bit charge value of each cell. These values can be clocked out of the CCD by external logic through a standard parallel bus interface.

[Figure: CCD. Labels: lens area, pixel rows, pixel columns, covered columns, electronic circuitry, electromechanical shutter]

Zero-bias error
- Manufacturing errors cause cells to measure slightly above or below actual light intensity
- Error typically same across columns, but different across rows
- Some of the leftmost columns blocked by black paint to detect zero-bias error
  - Reading of other than 0 in blocked cells is zero-bias error
  - Each row is corrected by subtracting the average error found in blocked cells for that row

Before zero-bias adjustment (the last two columns are the covered cells):

    Pixel values                         Covered   Zero-bias
                                         cells     adjustment
    136 170 155 140 144 115 112 248      12  14    -13
    145 146 168 123 120 117 119 147      12  10    -11
    144 153 168 117 121 127 118 135       9   9     -9
    176 183 161 111 186 130 132 133       0   0      0
    144 156 161 133 192 153 138 139       7   7     -7
    122 131 128 147 206 151 131 127       2   0     -1
    121 155 164 185 254 165 138 129       4   4     -4
    173 175 176 183 188 184 117 129       5   5     -5

After zero-bias adjustment:

    123 157 142 127 131 102  99 235
    134 135 157 112 109 106 108 136
    135 144 159 108 112 118 109 126
    176 183 161 111 186 130 132 133
    137 149 154 126 185 146 131 132
    121 130 127 146 205 150 130 126
    117 151 160 181 250 161 134 125
    168 170 171 178 183 179 112 124
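
The per-row correction described on the zero-bias slide can be sketched in C as follows. This is only an illustration under stated assumptions: the function name zero_bias_adjust, the 8 + 2 row layout (taken from the worked example, not from the 64 x 64 camera), and the clamp at 0 are not from the slides.

```c
#include <stddef.h>

#define PIXEL_COLS   8   /* visible pixels per row (assumed, matches the example table) */
#define COVERED_COLS 2   /* covered cells per row (assumed, matches the example table) */
#define ROW_WIDTH    (PIXEL_COLS + COVERED_COLS)

/* For each row, average the covered-cell readings and subtract that
   average from every visible pixel in the same row. */
void zero_bias_adjust(unsigned char image[][ROW_WIDTH], size_t rows)
{
    for (size_t r = 0; r < rows; r++) {
        unsigned sum = 0;
        for (size_t c = PIXEL_COLS; c < ROW_WIDTH; c++)
            sum += image[r][c];

        /* Integer division truncates, so covered cells (2, 0) give a bias of 1. */
        unsigned bias = sum / COVERED_COLS;

        for (size_t c = 0; c < PIXEL_COLS; c++) {
            /* Clamping at 0 is an assumption; the slides do not say what
               happens when the bias exceeds the pixel value. */
            image[r][c] = (unsigned char)(image[r][c] > bias ? image[r][c] - bias : 0);
        }
    }
}
```

Applied to the first row of the example (covered cells 12 and 14, bias 13), the pixel 136 becomes 123, which agrees with the adjusted table.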

Compression
- Store more images
- Transmit image to PC in less time
- JPEG (Joint Photographic Experts Group)
  - Popular standard format for representing digital images in a compressed form
  - Provides for a number of different modes of operation
  - Mode used in this chapter provides high compression ratios using DCT (discrete cosine transform)
  - Image data divided into blocks of 8 x 8 pixels
  - 3 steps performed on each block
    - DCT
    - Quantization
    - Huffman encoding

DCT step
- Transforms original 8 x 8 block into a cosine-frequency domain
  - Upper-left corner values represent more of the essence of the image
  - Lower-right corner values represent finer details
    - Can reduce precision of these values and retain reasonable image quality
- FDCT (Forward DCT) formula
  - C(h) = if (h == 0) then 1/sqrt(2) else 1.0
    - Auxiliary function used in main function F(u,v)
  - F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D_{xy} \cos\left(\frac{\pi (2x+1) u}{16}\right) \cos\left(\frac{\pi (2y+1) v}{16}\right)
    - Gives encoded pixel at row u, column v
    - D_{xy} is original pixel value at row x, column y
- IDCT (Inverse DCT)
  - Reverses process to obtain original block (not needed for this design)

Quantization step
- Achieve high compression ratio by reducing image quality
  - Reduce bit precision of encoded data
    - Fewer bits needed for encoding
    - One way is to divide all values by a power of 2
      - Simple right shifts can do this
  - Dequantization would reverse process for decompression

After being encoded using DCT:

    1150   39  -43  -10   26  -83   11   41
     -81   -3  115  -73   -6   -2   22   -5
      14  -11    1  -42   26   -3   17  -38
       2  -61  -13  -12   36  -23  -18    5
      44   13   37   -4   10  -21    7   -8
      36  -11   -9   -4   20  -28  -21   14
     -19   -7   21   -6    3    3   12  -21
      -5  -13  -11  -17   -4   -1    7   -4

After quantization (divide each cell's value by 8):

     144    5   -5   -1    3  -10    1    5
     -10    0   14   -9   -1    0    3   -1
       2   -1    0   -5    3    0    2   -5
       0   -8   -2   -2    5   -3   -2    1
       6    2    5   -1    1   -3    1   -1
       5   -1   -1   -1    3   -4   -3    2
      -2   -1    3   -1    0    0    2   -3
      -1   -2   -1   -2   -1    0    1   -1

Huffman encoding step
- Serialize 8 x 8 block of pixels
  - Values are converted into single list using zigzag pattern
- Perform Huffman encoding
  - More frequently occurring pixels assigned short binary code
  - Longer binary codes left for less frequently occurring pixels
- Each pixel in serial list converted to Huffman encoded values
  - Much shorter list, thus compression
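
The quantization and serialization steps above can be sketched in C as follows. Treat this as a sketch under assumptions rather than the camera's actual code: the function names are hypothetical, the rounding rule (to nearest, halves away from zero) is inferred from the divide-by-8 example values, and the traversal assumes the JPEG-style zigzag order, which the slides do not spell out.

```c
#define BLOCK 8

/* Divide v by 2^shift, rounding to nearest with halves away from zero.
   A bare right shift would truncate instead of round, so half of the
   divisor is added to the magnitude before shifting. */
short quantize_value(short v, unsigned shift)
{
    int half = shift ? 1 << (shift - 1) : 0;
    return (short)(v >= 0 ? (v + half) >> shift : -((-v + half) >> shift));
}

/* Quantize every cell of an 8 x 8 block in place. */
void quantize_block(short block[BLOCK][BLOCK], unsigned shift)
{
    for (int r = 0; r < BLOCK; r++)
        for (int c = 0; c < BLOCK; c++)
            block[r][c] = quantize_value(block[r][c], shift);
}

/* Serialize the block into a 64-entry list, walking the anti-diagonals
   (row + column = s) in alternating directions. */
void zigzag_serialize(short block[BLOCK][BLOCK], short out[BLOCK * BLOCK])
{
    int k = 0;
    for (int s = 0; s <= 2 * (BLOCK - 1); s++) {
        int lo = s < BLOCK ? 0 : s - (BLOCK - 1);  /* smallest valid row on this diagonal */
        int hi = s < BLOCK ? s : BLOCK - 1;        /* largest valid row on this diagonal */
        if (s % 2) {
            /* Odd diagonals run from top-right to bottom-left. */
            for (int r = lo; r <= hi; r++)
                out[k++] = block[r][s - r];
        } else {
            /* Even diagonals run from bottom-left to top-right. */
            for (int r = hi; r >= lo; r--)
                out[k++] = block[r][s - r];
        }
    }
}
```

With a shift of 3, the DCT value 1150 quantizes to 144 and -4 quantizes to -1, as in the example. Placing the low-frequency (upper-left) values first tends to group the small high-frequency values at the end of the list.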

Huffman encoding example
- Pixel frequencies on left
  - Pixel value -1 occurs 15 times
  - Pixel value 14 occurs 1 time
- Build Huffman tree from bottom up
  - Create one leaf node for each pixel value and assign frequency as node's value
  - Create an internal node by joining any two nodes whose sum is a minimal value
    - This sum is internal node's value
  - Repeat until complete binary tree
- Traverse tree from root to leaf to obtain binary code for leaf's pixel value
  - Append 0 for left traversal, 1 for right traversal
- Huffman encoding is reversible
  - No code is a prefix of another code

Pixel frequencies and Huffman codes:

    Pixel value   Frequency   Huffman code
         -1          15x      00
          0           8x      100
         -2           6x      110
          1           5x      010
          2           5x      1110
          3           5x      1010
          5           5x      0110
         -3           4x      11110
         -5           3x      10110
        -10           2x      01110
        144           1x      111111
         -9           1x      111110
         -8           1x      101111
         -4           1x      101110
          6           1x      011111
         14           1x      011110

[Figure: Huffman tree built from the pixel frequencies; root value 64]
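
The bottom-up tree construction above can be sketched in C as follows. This is a minimal sketch, not the slides' implementation: the names huffman_code_lengths, huffman_total_bits, and MAX_SYMBOLS are hypothetical, and it computes only code lengths (tree depths), not bit patterns. Ties between equal weights may be broken differently than in the figure, so the individual lengths need not match the table above.

```c
#include <stddef.h>

#define MAX_SYMBOLS 64   /* assumed bound: an 8 x 8 block has at most 64 distinct values */

/* Fills lengths[i] with the Huffman code length of the symbol whose
   frequency is freq[i]. Requires n <= MAX_SYMBOLS. */
void huffman_code_lengths(const unsigned freq[], size_t n, unsigned lengths[])
{
    unsigned weight[2 * MAX_SYMBOLS];  /* leaves first, internal nodes appended */
    int parent[2 * MAX_SYMBOLS];       /* -1 while a node is still a root */
    int count = (int)n;

    for (int i = 0; i < count; i++) {
        weight[i] = freq[i];
        parent[i] = -1;
    }

    /* n - 1 merges: each joins the two lightest roots under a new node. */
    for (size_t merge = 1; merge < n; merge++) {
        int a = -1, b = -1;            /* a: lightest root, b: second lightest */
        for (int i = 0; i < count; i++) {
            if (parent[i] != -1)
                continue;
            if (a == -1 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b == -1 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[count] = weight[a] + weight[b];
        parent[count] = -1;
        parent[a] = parent[b] = count;
        count++;
    }

    /* A leaf's code length is its depth: the number of steps to the root. */
    for (size_t i = 0; i < n; i++) {
        unsigned depth = 0;
        for (int p = parent[(int)i]; p != -1; p = parent[p])
            depth++;
        lengths[i] = depth;
    }
}

/* Total number of bits needed to encode all occurrences. */
unsigned huffman_total_bits(const unsigned freq[], const unsigned lengths[], size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += freq[i] * lengths[i];
    return total;
}
```

Passing the 16 frequencies from the table (15, 8, 6, 5, 5, 5, 5, 4, 3, 2, and six 1s) gives the most frequent value, -1, a 2-bit code, the same length as in the table.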

Archive step
- Record starting address and image size
  - Can use linked list
- One possible way to archive images
  - If max number of images archived is N:
    - Set aside memory for N addresses and N image-size variables
    - Keep a counter for location of next available address
    - Initialize addresses and image-size variables to 0
    - Set global memory address to N x 4
      - Assuming addresses, image-size variables occupy N x 4 bytes
    - First image archived starting at address N x 4
    - Global memory address updated to N x 4 + (compressed image size)
- Memory requirement based on N, image size, and average compression ratio

Uploading to PC
- When connected to PC and upload command received
  - Read images from memory
  - Transmit serially using UART
  - While transmitting
    - Reset pointers, image-size variables and global memory pointer accordingly

Requirements Specification
- System's requirements: what system should do
  - Nonfunctional requirements
    - Constraints on design metrics (e.g., "should use 0.001 watt or less")
  - Functional requirements
    - System's behavior (e.g., "output X should be input Y times 2")
- Initial specification may be very general and come from marketing dept.
  - E.g., short document detailing market need for a low-end digital camera that:
    - captures and stores at least 50 low-res images and uploads to PC,
    - costs around $100 with single medium-size IC costing less than $25,
    - has as long a battery life as possible,
    - has expected sales volume of 200,000 if market entry is within 6 months, 100,000 if between 6 and 12 months, insignificant sales beyond 12 months

Nonfunctional requirements
- Design metrics of importance based on initial specification
  - Performance: time required to process image
  - Size: number of elementary logic gates (2-input NAND gate) in IC
  - Power: measure of avg. electrical energy consumed while processing
  - Energy: battery lifetime (power x time)
- Constrained metrics
  - Values must be below (sometimes above) certain threshold
- Optimization metrics
  - Improved as much as possible to improve product
- Metric can be both constrained and optimization

Nonfunctional requirements (cont.)
- Performance
  - Must process image fast enough to be useful
  - 1 sec reasonable constraint
    - Slower would be annoying
    - Faster not necessary for low-end of market
  - Therefore, constrained metric
- Size
  - Must use IC that fits in reasonably sized camera
  - Constrained and optimization metric
    - Constraint may be 200,000 gates, but smaller would be cheaper
- Power
  - Must operate below certain temperature (cooling fan not possible)
  - Therefore, constrained metric
- Energy
  - Reducing power or time reduces energy
  - Optimized metric: want battery to last as long as possible

Informal functional specification
- Flowchart breaks functionality down into simpler functions
- Each function's details could then be described in English
  - Done earlier in chapter
- Low quality image has resolution of 64 x 64
- Mapping functions to a particular processor type not done at this stage

[Flowchart. Boxes: CCD input, Zero-bias adjust, DCT, Quantize, Archive in memory, Transmit serially. Decisions (yes/no): More 8 x 8 blocks?, Done? Output: serial output, e.g., 011010...]

Refined functional specification
- Refine informal specification into one that can actually be executed
- Can use C/C++ code to describe each function
  - Called system-level model, prototype, or simply model
  - Also is first implementation
- Can provide insight into operations of system
  - Profiling can find computationally intensive functions
- Can obtain sample output used to verify correctness of final implementation

[Figure: Executable model of digital camera. Modules CCD.C, CCDPP.C, CODEC.C, CNTRL.C, UART.C; input: image file; output: output file]

CCD module
- Simulates real CCD
- CcdInitialize is passed name of image file
- CcdCapture reads "image" from file
- CcdPopPixel outputs pixels one at a time

    #include <stdio.h>

    #define SZ_ROW 64
    #define SZ_COL (64 + 2)   /* 64 pixel columns plus 2 covered columns */

    static FILE *imageFileHandle;
    static char buffer[SZ_ROW][SZ_COL];
    static unsigned rowIndex, colIndex;

    void CcdInitialize(const char *imageFileName) {
        imageFileHandle = fopen(imageFileName, "r");
        rowIndex = -1;
        colIndex = -1;
    }

    /* Reads one full image (including covered columns) from the file. */
    void CcdCapture(void) {
        int pixel;
        rewind(imageFileHandle);
        for (rowIndex = 0; rowIndex < SZ_ROW; rowIndex++) {
            for (colIndex = 0; colIndex < SZ_COL; colIndex++) {
                if (fscanf(imageFileHandle, "%i", &pixel) == 1) {
                    buffer[rowIndex][colIndex] = (char)pixel;
                }
            }
        }
        rowIndex = 0;
        colIndex = 0;
    }

    /* Returns the next pixel in row-major order. */
    char CcdPopPixel(void) {
        char pixel;
        pixel = buffer[rowIndex][colIndex];
        if (++colIndex == SZ_COL) {
            colIndex = 0;
            if (++rowIndex == SZ_ROW) {
                colIndex = -1;
                rowIndex = -1;
            }
        }
        return pixel;
    }

CCDPP (CCD PreProcessing) module
- Performs zero-bias adjustment
- CcdppCapture uses CcdCapture and CcdPopPixel to obtain image
- Performs zero-bias adjustment after each row read in

    #define SZ_ROW 64
    #define SZ_COL 64

    static char buffer[SZ_ROW][SZ_COL];
    static unsigned rowIndex, colIndex;

    void CcdppInitialize() {
        rowIndex = -1;
        colIndex = -1;
    }

    void CcdppCapture(void) {
        char bias;
        CcdCapture();
        for (rowIndex = 0; rowIndex < SZ_ROW; rowIndex++) {
            for (colIndex = 0; colIndex < SZ_COL; colIndex++) {
                buffer[rowIndex][colIndex] = CcdPopPixel();
            }
            /* The last two pixels of each CCD row are the covered cells. */
            bias = (CcdPopPixel() + CcdPopPixel()) / 2;
            for (colIndex = 0; colIndex < SZ_COL; colIndex++) {
                buffer[rowIndex][colIndex] -= bias;
            }
        }
        rowIndex = 0;
        colIndex = 0;
    }

    char CcdppPopPixel(void) {
        char pixel;
        pixel = buffer[rowIndex][colIndex];
        if (++colIndex == SZ_COL) {
            colIndex = 0;
            if (++rowIndex == SZ_ROW) {
                colIndex = -1;
                rowIndex = -1;
            }
        }
        return pixel;
    }

UART module
- Actually a half UART
  - Only transmits, does not receive
- UartInitialize is passed name of file to output to
- UartSend transmits (writes to output
  file) one byte at a time

    #include <stdio.h>

    static FILE *outputFileHandle;

    void UartInitialize(const char *outputFileName) {
        outputFileHandle = fopen(outputFileName, "w");
    }

    /* Writes one byte to the output file as a decimal integer on its own line. */
    void UartSend(char d) {
        fprintf(outputFileHandle, "%i\n", (int)d);
    }

CODEC module
- Models FDCT encoding
- ibuffer holds original 8 x 8 block
- obuffer holds encoded 8 x 8 block
- CodecPushPixel called 64 times to fill ibuffer with original block
- CodecDoFdct called once to transform 8 x 8 block
  - Explained in next slide
- CodecPopPixel called 64 times to retrieve encoded block from obuffer

    static short ibuffer[8][8], obuffer[8][8], idx;

    void CodecInitialize(void) {
        idx = 0;
    }

    void CodecPushPixel(short p) {
        if (idx == 64) idx = 0;
        ibuffer[idx / 8][idx % 8] = p;
        idx++;
    }

    void CodecDoFdct(void) {
        int x, y;
        for (x = 0; x < 8; x++) {
            for (y = 0; y < 8; y++) {
                obuffer[x][y] = FDCT(x, y, ibuffer);
            }
        }
        idx = 0;
    }

    short CodecPopPixel(void) {
        short p;
        if (idx == 64) idx = 0;
        p = obuffer[idx / 8][idx % 8];
        idx++;
        return p;
    }

CODEC (cont.)
- Implementing FDCT formula
  - C(h) = if (h == 0) then 1/sqrt(2) else 1.0
  - F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} D_{xy} \cos\left(\frac{\pi (2x+1) u}{16}\right) \cos\left(\frac{\pi (2y+1) v}{16}\right)
- Only 64 possible inputs to COS, so table can be used to save
  performance time
  - Floating-point values multiplied by 32,768 and rounded to nearest integer
  - 32,768 chosen in order to store each value in 2 bytes of memory
  - Fixed-point representation explained more later
- FDCT unrolls inner loop of summation, implements outer summation as two consecutive for loops

    static const short COS_TABLE[8][8] = {
        { 32768,  32138,  30273,  27245,  23170,  18204,  12539,   6392 },
        { 32768,  27245,  12539,  -6392, -23170, -32138, -30273, -18204 },
        { 32768,  18204, -12539, -32138, -23170,   6392,  30273,  27245 },
        { 32768,   6392, -30273, -18204,  23170,  27245, -12539, -32138 },
        { 32768,  -6392, -30273,  18204,  23170, -27245, -12539,  32138 },
        { 32768, -18204, -12539,  32138, -23170,  -6392,  30273, -27245 },
        { 32768, -27245,  12539,   6392, -23170,  32138, -30273,  18204 },
        { 32768, -32138,  30273, -27245,  23170, -18204,  12539,  -6392 }
    };

    static int FDCT(int u, int v, short img[8][8]) {
        double s[8], r = 0;
        int x;
        for (x = 0; x < 8; x++)
            s[x] = img[x][0] * COS(0, v) + img[x][1] * COS(1, v) +
                   img[x][2] * COS(2, v) + img[x][3] * COS(3, v) +
                   img[x][4] * COS(4, v) + img[x][5] * COS(5, v) +
                   img[x][6] * C