One of the most important decisions you will make when starting a new camera design is the type of processor chosen for sensor control. If you have the option or budget to throw an OMAP or i.MX processor at your design, do it; you have gone a long way toward completion. Other applications require razor-thin BOMs, or must use sensors that do not match these CPUs' input capabilities. In these cases, a different architectural approach must be used. Unfortunately, very few micros are fast enough to directly control the transfer of data that occurs once capture begins. Decisions must then be made as to how the control system is going to work and, more importantly, whether the data needs to be manipulated as sensor data 'broadsides' the system. At best, sensor timing allows very little time for CPU intervention during a capture process. Image synchronization takes many forms, but direct CPU control must be carefully thought out, if not avoided outright, if any degree of capture speed is required. To help work around this limitation, there are some pretty cool tricks, such as on-the-fly LUTs and multi-tap FPGA paths, that can be designed into a data path to help avoid imaging latency. Regardless of the approach taken, a careful analysis must be made before putting down the first IC.
Problems I have been called in to fix include:
Architectural designs that were too complicated for the task at hand.
KISS (Keep It Simple, Stupid) is still the best possible solution to a challenging system design. Minimize data paths and ICs whenever possible. If you can use one FPGA to eliminate 12 discretes, do it. Your manufacturer will love you.
Not planning for easy test of the system, up front, for when the prototype PCBs are in your hands.
This one is REALLY, REALLY, REALLY important. Unless you have unlimited time to do simulations with expensive software like ModelPro, you are going to end up using a scope and analyzer to find out what is going on. Creating an analyzer port on complex FPGA designs is so important that I don't even recommend attempting a big design without one. For the first spin, it only costs a connector and a number of I/O pins on the FPGA. This allows you to make Verilog or VHDL changes on the fly via JTAG and display the results on your analyzer to squish bugs in near real time. That is, of course, if you ever have bugs. If you don't, I would like to work with you! I could learn something...
Next, CREATE TEST PATTERNS. Nothing says success until you can prove that you have good image quality, and time spent trying to determine that may be minimized by designing the system for checks all along the data capture chain. I typically like to embed test images in FPGAs to programmatically replace the sensor inputs to test the data path. A typical pattern is shown here:

This block pattern is effective for testing horizontal and vertical synchronization as well as for validating the integrity of the data. The different levels of gray allow me to do histogram testing to find noise and data transfer problems. Once you get a good data path under control, you have a much better shot at a successful sensor integration.
Depending on the sensor type, there are other patterns that may be highly useful, but these are pretty much application dependent.
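To make the histogram idea concrete, here is a minimal software sketch of a gray block pattern and a check against it. It is only a model of the technique, not the pattern embedded in the FPGA: the image size, block size, number of gray levels, and the names `block_pattern`, `histogram`, and `check_capture` are all assumptions made for illustration.

```python
from collections import Counter


def block_pattern(width=64, height=48, block=8, levels=8, max_value=255):
    """Return rows of pixels forming flat gray blocks, `block` pixels on a side.

    All dimensions here are hypothetical; pick ones that suit your sensor.
    """
    grays = [round(i * max_value / (levels - 1)) for i in range(levels)]
    # Step the gray level along both axes so row slips and column slips both show up.
    return [
        [grays[(x // block + y // block) % levels] for x in range(width)]
        for y in range(height)
    ]


def histogram(image):
    """Count how many pixels sit at each gray value."""
    return Counter(pixel for row in image for pixel in row)


def check_capture(captured, reference):
    """Return (stray, mismatched) pixel counts for a captured frame.

    stray: pixels at gray values the pattern never uses (noise, flipped bits).
    mismatched: pixels that differ from the reference at the same position
    (this also catches sync slips, where every value is legal but misplaced).
    """
    allowed = set(histogram(reference))
    stray = sum(n for value, n in histogram(captured).items() if value not in allowed)
    mismatched = sum(
        c != r
        for cap_row, ref_row in zip(captured, reference)
        for c, r in zip(cap_row, ref_row)
    )
    return stray, mismatched


reference = block_pattern()
captured = [row[:] for row in reference]
captured[10][10] ^= 0x01  # simulate one flipped data bit on the bus
print(check_capture(captured, reference))  # (1, 1)
```

Because the pattern uses only a handful of gray values, anything that lands between them in the histogram is immediately suspect, while a frame that is shifted by a pixel or a line produces mismatches with no stray values at all.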
Under-estimating system performance requirements with respect to image capture.
Offload, offload, offload. Using low cost FPGAs to supplant CPU processing can be extraordinarily effective. One of the key elements of the panoramic camera shown elsewhere in this blog was system cost. A USB approach for data capture was chosen to eliminate requirements for special PCI, PCIe or other data capture cards. Eliminating these necked down the maximum camera transfer rate to about 20-25MB/sec, but really did save system cost and allowed more money to be invested into the camera, which is where the money should have been spent.
Using FPGAs for all the time-intensive tasks allowed the selection of a control processor that cost only $4.00. In retrospect, the FPGAs worked so well in this application that the $4 processor could have been replaced with one costing only $1-$2. The processor handled I2C sensor setup, USB protocol, and general control and housekeeping. It only had to have enough horsepower and memory to handle a USB specification. This particular design used the Digital Camera / PIMA spec.
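It is worth putting numbers on what a 20-25MB/sec ceiling means for your frame rate before committing to it. The sketch below is simple arithmetic; the frame dimensions and bytes per pixel are hypothetical and are not those of the panoramic camera, and the function names are mine.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one raw frame in bytes."""
    return width * height * bytes_per_pixel


def transfer_seconds(n_bytes, rate_mb_per_s):
    """Time to move n_bytes at a sustained rate given in MB/sec (1 MB = 1,000,000 bytes)."""
    return n_bytes / (rate_mb_per_s * 1_000_000)


# Hypothetical sensor: 2592 x 1944 pixels, one byte per pixel.
size = frame_bytes(2592, 1944, 1)
for rate in (20, 25):
    print(f"{size} bytes at {rate} MB/sec: {transfer_seconds(size, rate):.3f} s per frame")
```

If the per-frame time that comes out of this is acceptable for the application, the cheaper USB route is a reasonable trade; if not, no amount of clever firmware will recover it later.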
Inadequate DMA or ISR routine speeds.
Ok, so you decided against the FPGA route to save money, or you are not so comfortable with FPGAs. You have chosen an architecture that relies on an ISR to grab data from the camera and burst it to memory via some DMA routine. That is actually a good approach, if your handheld runs at > 200MHz. If you are working with low cost devices, you may have a problem. One easy way to handle data buffers is with SRAM, but nowadays, having a 5MB SRAM is out of the question. SDRAM is better, cheaper, and faster, but requires either an SDRAM controller in the CPU you are working with (out of the question for low cost systems), or, once again, an FPGA. Unfortunately, there is no magic here, and SDRAMs come with their own set of problems with respect to interface timing complexity. In the image below, the setup cycle for an SDRAM is shown right at the start of capture. This is a single-burst capture, which makes poor use of the memory bandwidth. In single-burst mode, the overhead for each transfer is quite bad, but it is still fast for a very low-cost micro.
The problem shown here is that from the start of capture, AND THIS IS NOT A FAST SYSTEM, there is less than 200ns available to set up the transfer, which can tax an ISR. Worse yet, each subsequent transfer is only 200ns behind the last one. It won't take long before the processor gives up, and kills any concurrent tasks that may be running.
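A quick way to see why 200ns is so tight is to convert the window into CPU clock cycles. The sketch below does just that. The 30-cycle ISR cost is a made-up figure for illustration (interrupt entry, a handful of instructions, and exit); measure your own part, since real latency varies a lot between cores.

```python
def cycles_available(window_ns, cpu_mhz):
    """Whole CPU clock cycles that fit inside a time window."""
    return window_ns * cpu_mhz // 1000


def isr_fits(window_ns, cpu_mhz, isr_cycles):
    """True if an ISR costing `isr_cycles` clocks completes inside the window."""
    return isr_cycles <= cycles_available(window_ns, cpu_mhz)


WINDOW_NS = 200   # setup window from the timing described above
ISR_CYCLES = 30   # hypothetical cost of entry + transfer setup + exit

for mhz in (20, 48, 200):
    print(mhz, "MHz:", cycles_available(WINDOW_NS, mhz), "cycles,",
          "fits" if isr_fits(WINDOW_NS, mhz, ISR_CYCLES) else "does not fit")
```

At 200MHz the window is 40 clocks, which is workable; at 20MHz it is 4 clocks, which is not enough to even enter an interrupt on most micros, let alone set up a transfer.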
The timing shown here is from a system where the CPU idled, and an FPGA did the heavy lifting. There were multiple paths running in parallel, giving a 10MB/sec transfer rate. When modified to an 8-cycle burst instead of a single cycle, the transfer rate went up to 80MB/sec, but the USB was overwhelmed by the data. The design ended up with a 2-cycle burst (20MB/sec), which throttled the system down to match the latencies of USB 2.0 HS. You may end up having to play games to get the timing just right.
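The burst length trade-off above can be sketched as a small selection routine. It assumes the transfer rate scales linearly with burst length, which matches the three figures quoted (10, 20, and 80MB/sec) but is a simplification of real SDRAM timing. The candidate burst lengths and the function names are assumptions for illustration.

```python
SINGLE_BURST_MB_S = 10  # rate quoted above for a single-cycle burst


def transfer_rate(burst_len, single_rate=SINGLE_BURST_MB_S):
    """Approximate memory transfer rate in MB/sec, assuming linear scaling with burst length."""
    return burst_len * single_rate


def pick_burst(downstream_limit_mb_s, options=(1, 2, 4, 8)):
    """Longest burst whose rate does not exceed what the downstream link can absorb.

    Returns None if even the shortest burst is too fast.
    """
    fitting = [b for b in options if transfer_rate(b) <= downstream_limit_mb_s]
    return max(fitting) if fitting else None


# The USB link was good for roughly 20-25MB/sec in this design.
for limit in (20, 25):
    burst = pick_burst(limit)
    print(f"limit {limit} MB/sec -> burst {burst} ({transfer_rate(burst)} MB/sec)")
```

With either end of the 20-25MB/sec range, the 2-cycle burst is the longest one that does not outrun the USB, which is where the design landed.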
Next...Lack of understanding about basic architectural limitations.