Ron Horii's Tech Pages:
Scanning


Contents
Introduction
Predictions
Types of Scanners
Scan Bit Depth
Scan Performance Measurements
Scan Resolution Rules of Thumb
JPEG Quality vs. Compression
Examples of JPEGs at Different Quality Levels
My Space-Squeezing Experience
Zooming In
Reliability Considerations
Scanner Recommendations
Scanner Information and Reviews
OCR Software
Image Editing Software
Scanner Manufacturers
Computer Art, Graphics, and Photography
Retailers and Distributors
1999 Update Note
Home Links
Introduction
I bought a scanner early in 1997. I thought
it might be handy, but I didn't have a clue that it would end up being
one of the most useful peripherals I've ever bought. I've come to the conclusion
that a scanner is a critical enabling technology that provides a quantum
leap in the power and usefulness of a PC. It's a key input device like
mice, keyboards, audio digitizers, and video digitizers. It feeds a portion
of the real world into a computer, where the processing and communications
power of the computer can turn the input into something useful. It fills
a critical gap in this regard. Mice, joysticks, and keyboards feed user
movements and commands into the computer. Sound cards turn sound into data
and vice versa. Video digitizers and digital video cameras turn real-time
motion into computer data. One major area that's left is a gigantic one:
printed matter and artwork.
Where is the vast majority of civilized
man's knowledge and expression stored? It's not in people's minds. It's
not on tape or computer media (yet). It's mostly on paper or canvas in
libraries, galleries, museums, schools, businesses, and homes around the
world. This includes knowledge and information dating back to the beginning
of human civilization. What's the big problem with this information? It's
hard to find, and it's hard and slow to get to. You often have to physically
travel to places, sometimes far away, to find them. Even when you get there,
the material may be damaged, lost, restricted, checked out, or not what
you're really looking for. Searching for the information can be very time-consuming,
even with computerized card catalogs, because you still have to find the
actual works and manually scan through them. The difficulty
of getting at this information limits its usefulness. Productivity
and progress are thus limited.
The computer is a tremendous tool. It enables
a user to search, access, manipulate, file, reuse, and transmit information
anywhere in the world instantly. The trick is to get the information into
it. The scanner is the link between printed information and the computer.
The scanner can take this information and turn it into a form that a computer
can work with. It opens up the computer to an enormous and virtually unlimited
source of input. With the power of the computer applied to this source
of input, the possibilities are staggering. Information is power. Combine
that computer processing power with the instantaneous and widespread capability
of disseminating information on the Internet and the Worldwide Web, and
you have a potential for empowering the average person to a degree that's
unprecedented in human history.
That's all impressive from a global standpoint,
but for the average user, the question is: what can a scanner do for me?
Here are some general uses:
-
Turning your computer into a simulated fax
machine.
-
Turning your computer and inkjet printer into
a color copier (though a very slow one).
-
Scanning documents and graphics for filing
away in computer-accessible libraries.
-
Scanning printed documents and using OCR (Optical
Character Recognition) to turn the information on the pages into data that
can be fed into word processors, spreadsheets, or databases.
-
Scanning drawings, charts, and photographs
for publishing on the Web or intranets or sending by E-mail.
Here are some specific business and scientific
uses:
-
Microscope photographs, X-rays, spectrographs,
chart recorder traces, chromatographs, or other hardcopy images can be
scanned in and filed away with associated data. They can later be used
for reports or presentations.
-
Polaroids of writing on blackboards, hand-written
view foils or meeting charts can be scanned to preserve them and make
them digitally accessible. It's even possible to OCR them if the handwriting
is legible. The same can be done for hand-written notebook entries.
-
Hardcopy forms can be scanned to turn them
into softcopy forms. Information written on the forms can be OCR'd into
a computer.
-
Hand-drawn schematic diagrams can be scanned,
turning them into CAD schematics. The same idea can work with actual printed
circuit boards or with hand-drawn mechanical drawings.
-
Industrial design: scanning photographs of
real equipment and furniture, as well as people, using an image editing
program to move the objects around to find the optimal arrangements.
-
Security: scanning photographs, fingerprints,
and signatures of personnel for online security verification.
-
Online sales: scanning pictures of products
for advertising on the Web.
-
Labels: scanning pictures of components or
the parts themselves to put on labels for drawers or boxes.
-
Online operating manuals: photographs of equipment
can be scanned in. Using hotspots, image maps, and help files, operators
can click on the different parts of the equipment to find out what they
do.
Here are some specific home uses:
-
Scanned personal snapshots can be used to
make personal Web pages (like
mine).
-
Family snapshots can be scanned to make greeting
cards, calendars, Christmas letters, Christmas ornaments, T-shirt iron-ons,
bookmarks, posters, signs, awards, children's crafts, and gifts.
-
Old family pictures can be scanned in and
retouched with an image editing program. This can be used to remove scratches,
dust spots, streaks, rips, and former in-laws. It can also be used to colorize
black-and-white photos or improve the contrast, sharpness, and color of
color photos.
-
Family tree albums can be created with scanned
old photographs, and with personal stories on each family member.
-
Color pictures from books and magazines can
be scanned and included in school reports, posters, or science projects,
without having to cut up the sources or make expensive color copies of
them.
-
Photographs of pets and children, as well
as fingerprints, can be scanned and stored. If they become lost, this information
can be E-mailed to the proper authorities or posted on the Web to help
locate them. Tattoos, birthmarks, or other markings on certain parts of
the body (I won't specify which) can be scanned in directly.
-
Photographs of property can be scanned in
and stored for record-keeping and insurance purposes, or used to make sales
posters or posted on the Internet for sales. Stamp, coin, trading card,
some jewelry, and small (dimension-wise) art collections can be scanned
in directly.
-
Sheet music can be scanned in, converted to
digital form and used to program and play on MIDI-compatible instruments.
If you play an instrument, you can practice playing with an orchestra by
scanning in a whole orchestral score, editing out your part, and having
the computer play the rest, while you play along.
-
Some homework exercises require copying down
sentences and correcting them. Instead of re-typing them all in, they can
be scanned in and OCR'd, then corrected with an editor. (This should only
be done with the teacher's permission, and not with the use of spelling
or grammar checking programs.)
-
Line drawings can be scanned in and imported
into a paint program. Using a color fill tool, children can color in the
drawings like a computerized coloring book, print out their handiwork,
then E-mail a copy to grandma.
-
Children who have to stay home due to illness
can scan their homework in and E-mail it to their teachers for correction
to help stay up with their classes. Teachers similarly can scan in new
homework and E-mail it to their students at home.
-
Some people with large home libraries enter
the book titles into computerized databases to help catalog them. However,
they may forget what the books are about. If they scan in the book covers
or jackets, this can help. The same can be done for CD's, cassette tapes,
and video tapes.
-
Magazine and cookbook recipes can be scanned,
OCR'd, and filed away in recipe databases.
-
Instead of saving mountains of old magazines
or boxes of clippings, important articles can be scanned and filed, and
the magazines can be thrown out.
-
Some dry food items, such as pasta, beans,
cookies, candy, dried fruit, nuts, etc. can be put in a clear bag and scanned
to make illustrative labels for storage containers. The same can be done
for other items, such as pencils, paper clips, needles, buttons, and small
toys.
-
Instead of pressing flowers and leaves in
books to preserve them, they can be scanned in directly on a scanner and
be preserved electronically, which preserves their colors when they're
fresh.
For technical info about scanner usage, here
is one of the best online documents on scanners: A
Few Scanning Tips by Wayne Fulton. The author is a customer and user
of a Microtek E3, like me. (Reading his pages along with other reviews
was one of the reasons I bought the E3.) His pages have just about everything
you would want to know about scanner usage, terminology, and tradeoffs.
Predictions
I predict that scanners will soon become
so important that they will be bundled with PC's like modems and sound
cards. A low-end scanner is about the same price as a high-end sound card,
and for office use, a scanner is much more useful than a sound card. Printers
and scanners complement each other, so it's likely they'll be bundled together
with compatible resolution.
I also predict that the huge memory requirements
of scanned images will drive the need for more powerful computers with
enhanced graphics-handling capability, more RAM, and more storage. In the
storage area, this not only includes larger hard disk drives, but high-capacity
removable storage for archiving and transporting data. Standard 3.5" 1.44 MB
diskettes are inadequate for this task.
As it becomes easier to incorporate color
graphic content into documents, this will increase the demand for high-quality
color output devices, such as inkjet and color laser printers. The resolution
and quality for both scanners and printers will increase until they meet
or surpass film and high-quality printing processes.
Graphics made the World Wide Web the instant
phenomenon that it is today. Scanners put more power into the hands of
small office, academic, and home users. Scanned images will proliferate
even more on Web pages (like this one), which will increase the demand
for more bandwidth on the Web.
Scanners may displace dedicated fax machines
in some office applications where fax demands are light and a computer,
printer, and modem are already present. For multiple-page faxing, an auto-document
feeder is useful. A scanner, with the appropriate hardware and software
at the receiving end, can do something no ordinary fax machine can do:
send high-resolution color faxes.
Digital cameras will proliferate and will
displace film cameras to some degree. They won't, however, replace scanners
for a long time. The resolution, in terms of total numbers of points digitized,
of a digital camera is orders of magnitude smaller than that of a scanner. A high-end
digital camera, costing in the $1K+ range, will have a resolution of 1000+
points total across the entire image. A low-end scanner costing
under $100 can resolve (with interpolation) 1000+ points every fraction
of an inch. A digital camera has the advantage of portability and immediacy
and is very useful, but it's a different use from a scanner. It's like
the difference between RAM and hard disk storage. The two are complementary,
not mutually exclusive. I do believe that digital cameras will become more
popular as they get better and cheaper, but they'll displace film cameras,
not scanners.
Types of Scanners
There are several types of scanners. The type to get depends
on your application:
| Type | Usage |
| Handheld | Specialized applications, portable scanning of documents for OCR. Cheap, light, but limited in size of scan and quality. |
| Sheet-Fed | Ideal for OCR of multiple sheets of text pages. Many have auto-document feeders. Compact, some are portable, can be cheaper than flatbeds. Quality not quite up to the best flatbeds. Size of scan theoretically unlimited in vertical direction. Specialized variations include photo scanners, business card scanners, and combination keyboard-scanners. |
| Flatbed | Most versatile scanner. Can scan sheets, books, objects. Wide range of price, quality. Some have auto-document feeders and slide scanners as accessories. Takes a lot of deskspace. |
| Film | For direct scans of negatives and slides, usually 35 mm. For professional photographic work of highest quality. Compact, but more expensive than above types. Has widest dynamic range. |
| Drum | Highest quality for scans of sheets. Uses photomultiplier tube. Has highest resolution, dynamic range, and color fidelity. Extremely expensive, graphics arts pros only. Usually owned by service bureaus. |
Scan Bit Depth
Scanners convert analog data
(page images) to digital data. Digital data can have different bit depths
depending on the application and scanner hardware:
-
1-bit: used for black-and-white
line drawings and text, faxes, and OCR work.
-
8-bit monochrome: used for scanning
black-and-white photographs
-
8-bit color: also called pseudo
color, not directly supported by scanners, but scan files can be converted
to this, the color depth for GIF files and MCGA video adapters.
-
16-bit color: also called high-color,
65K colors, also not supported directly by scanners, but scan files can
be converted to this.
-
24-bit color: also called true
color, for color photographs, the maximum bit resolution defined in most
common file types and the minimum bit resolution for most color scanners.
-
>24-bit color: 30, 36 bits are
common. A 10-bit or 12-bit analog-to-digital converter is used for each
color (red, green, and blue). There are few common file formats that support
these bit depths (TIFF is the only one I know of), but the extra bits are
used to provide extra dynamic range for the scans, especially in resolving
detail in dark areas and highlights. For high-quality photographic work,
the extra bits are worth it. For more information, see "Will
30-bit and 36-bit Scanners Give Better Scanned Images? "
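As a rough illustration of what these bit depths mean in numbers, here is a minimal Python sketch. The function names are made up for this example, and it assumes that color bits are split evenly across the red, green, and blue channels, as described above for 30-bit and 36-bit scanners.

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel of the given bit depth can take."""
    return 2 ** bits_per_pixel


def levels_per_channel(bits_per_pixel: int, channels: int = 3) -> int:
    """Tonal levels per channel, assuming bits are split evenly across channels."""
    if bits_per_pixel % channels != 0:
        raise ValueError("bit depth does not divide evenly across channels")
    return 2 ** (bits_per_pixel // channels)


if __name__ == "__main__":
    # 1-bit line art, 8-bit grayscale or pseudo color, high color, true color
    for bits in (1, 8, 16, 24):
        print(f"{bits:>2}-bit: {color_count(bits):,} values per pixel")
    # Per-channel levels for 24-, 30-, and 36-bit RGB scans
    for bits in (24, 30, 36):
        print(f"{bits}-bit RGB: {levels_per_channel(bits):,} levels per channel")
```

A 16-bit pixel gives 65,536 values (the "65K colors" above), while a 36-bit scanner resolves 4,096 levels per channel compared with 256 for 24-bit color.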

Scan Performance Measurements
I ran a test on my scanner at
home to see how long and how much memory it takes to scan at different
resolutions and in color vs. black & white. Resolution is measured
in SPI (Samples Per Inch) or DPI (Dots Per Inch). I measured scan time
from the time I hit the button to start the scan until the scanned image
started to appear on the video screen.
Conditions:
Scanner: Microtek Scanmaker
E3
Maximum resolution: 300
X 300 DPI optical resolution, 2400 X 2400 interpolated resolution
Interface: SCSI
Computer: Pentium 120 CPU
Memory: 49 MB RAM, 128K
pipeline burst cache
Scan size: 8 1/2" X 11"
Scan Size and Time Measurements Table
| Scan Type | Resolution | File Size | Scan Time |
| 1-bit B&W | 75 DPI | 258 KB | 13 secs |
| 1-bit B&W | 100 DPI | 458 KB | 18 secs |
| 1-bit B&W | 300 DPI | 4.11 MB | 65 secs |
| 1-bit B&W | 600 DPI | 16.44 MB | 175 secs |
| 24-bit Color | 75 DPI | 6.17 MB | 59 secs |
| 24-bit Color | 100 DPI | 10.96 MB | 81 secs |
| 24-bit Color | 300 DPI | 98 MB | * |
| 24-bit Color | 600 DPI | 394.4 MB | * |
| 24-bit Color | 1200 DPI | 1.58 GB | * |
| 24-bit Color | 2400 DPI | 2.4 GB | * |
* Not enough memory to scan.
This shows that scan time
and memory required can vary tremendously with resolution. Color also takes
much more time and memory than B&W. Scanning is processor and memory
intensive, so the scan speed depends on the speed of the PC and the amount
of RAM installed. The memory requirements are such that even if your scanner
is capable of 1200 to 9600 DPI interpolated, you may find it impractical
to use such high resolutions except for special purposes. (See Zooming
In.)
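To see why memory climbs so quickly, here is a minimal Python sketch that estimates the raw, uncompressed size of a scan from its dimensions, resolution, and bit depth. The function name is hypothetical, and the estimate covers only the bare pixel data. The measured figures in the table above come from the scanning software and run higher than this estimate, so treat the sketch as a picture of the scaling, not a reproduction of the table.

```python
import math


def raw_scan_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed pixel data size of a scan, in bytes.

    Assumes each row is packed to a whole number of bytes and ignores
    any file header or working memory the scanner driver may need.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return bytes_per_row * height_px


if __name__ == "__main__":
    # Letter-size page in 24-bit color at several resolutions
    for dpi in (75, 100, 300, 600, 1200, 2400):
        size = raw_scan_bytes(8.5, 11, dpi, 24)
        print(f"{dpi:>4} DPI: {size / 1_000_000:,.1f} million bytes")
```

Doubling the resolution doubles the samples in both directions, so the size goes up by a factor of four. Going from 1-bit to 24-bit multiplies it by roughly 24 again.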
Scan time may or may not
be important, depending on what your application is. If you're using the
scanner for photo imaging, you'll likely spend much more time editing
and manipulating the photo scan than doing the scan itself. Scan time becomes
important if you're scanning in multiple pages for OCR or archiving purposes.
However, for OCR work, the OCR processing time can be longer than the scan
time.
One problem with trying to
compare scan times is that there is no standard measurement. One manufacturer
may quote the scan time at 300 DPI of a 4 X 6 photo, while another may
quote the scan time for an 8 1/2 X 11 sheet at 100 DPI. Independent reviewers
will test all the scanners they're reviewing under the same test conditions,
but different reviewers may use different conditions. I've seen in some
reviews that the relative ranking of the speed of different scanners will
vary with the test conditions. The PC speed and configuration can have
a big effect on the scan speed. Some scanners may have built-in intelligence
and rely little on the PC speed, while others may heavily use PC resources
and will be very sensitive to PC speed. Scanner drivers can vary in the
way they use RAM and hard drive space to store scans, so the scan speed
can be greatly affected by the amount of RAM or the speed of the PC's hard
drive.
Scan Resolution Rules of Thumb
One of the most important scanner
specs is resolution. Resolution is measured in samples per inch (SPI) in
the horizontal and vertical directions. It is usually specified as DPI
(dots per inch) or LPI (lines per inch), but technically, what the scanner
is doing is acquiring a certain number of digital samples per inch. Except
for drum scanners, which use photomultiplier tubes, most scanners use CCD
arrays. The horizontal resolution is fixed by the number of CCD elements
across the page. The vertical resolution is determined by the stepper motor
step size. In sheet-fed scanners, the stepper motor moves the paper. In
flatbed scanners, it moves the CCD array. Moving the CCD array is a more
precise process than moving paper, so flatbed scanners can have higher
vertical resolution than sheet-fed scanners. Most scanner manufacturers
specify the optical resolution, which is the hardware-limited resolution,
but they also tend to specify the interpolated resolution, which uses software
to generate fake samples in-between real samples. It's mostly useful for
removing the "jaggies" from line art. Be wary of suspiciously high resolution
specs. If you see a new <$100 scanner boasting "2400 DPI" resolution,
it's probably specifying the interpolated resolution.
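To make the idea of "fake samples" concrete, here is a minimal Python sketch of interpolating one row of samples. It uses simple linear interpolation, which is an assumption for illustration only; scanner drivers may use other methods, and the function name is made up. The point is that the in-between values are computed from their neighbors, not measured from the page.

```python
def interpolate_row(samples: list, factor: int) -> list:
    """Upsample one row of real samples by an integer factor.

    Every value between two real samples is a weighted average of
    those two neighbors, so no new detail is captured.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not samples:
        return []
    out = []
    for left, right in zip(samples, samples[1:]):
        for step in range(factor):
            t = step / factor  # 0 at the real sample, approaching 1 at the next one
            out.append(left + (right - left) * t)
    out.append(samples[-1])
    return out


if __name__ == "__main__":
    real = [0, 100, 50]              # three real samples
    print(interpolate_row(real, 2))  # one computed value between each pair
```

Going from 300 DPI optical to 2400 DPI interpolated corresponds to a factor of 8 in each direction, so the great majority of the resulting values are computed, not sampled.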
How much resolution do you
need? It depends on your application. The more resolution, the more you
pay, so you don't want to pay for resolution you don't need. However, if
you have multiple applications for the scanner, get the resolution appropriate
for the most demanding application you think you MIGHT have. Keep in mind
that the important specification is image resolution. Image resolution
is the resolution of the final output. That depends on the initial sample
resolution and how much you blow it up or shrink it down. If you blow images
up, you'll need more sample resolution than if you use them actual size.
Conversely, if you shrink scanned images down, you need less sample resolution.
If you don't know what you need, get the highest resolution scanner you
can afford, or wait until the prices come down. Here are some rules of
thumb on what scanning resolution is needed for different applications,
based upon using the images actual size:
-
72 DPI is the number that has
traditionally been quoted as adequate for pictures to be published on
the Web at actual size. This will probably go up in the future as larger,
higher-resolution monitors and video cards become more common. 72 DPI is
the final resolution of the image when publishing it on the Web. You can
scan pictures at this resolution, but you may get better results by scanning
a picture at a higher resolution, then using a good image editor to touch
it up and resample it down to 72 DPI. There's nothing magic about the number
72. See "Why 72 dpi?"
for more information.
-
The resolution for video displays
varies with the video setting and size of the screen. (See Wayne Fulton's
"A few scanning tips -
Basics Part 1 - Video resolution.") "A 17 inch monitor screen might
measure 12 inches horizontally. If it is set to 1024x768 screen size, then
the image is obviously 1024 dots / 12 inches = 85 dpi resolution in that
case. A 15 inch monitor at 800x600 might be 75 dpi. A 14 inch monitor at
640x480 might be about 65 dpi."
-
100-600 DPI is adequate for
graphics to be printed full-size on most inkjet or laser printers, depending
on the printer resolution and desired print quality. The scan should be
higher than the printer dot pitch. Too much scan resolution is overkill.
-
100-200 DPI B&W is fax quality.
The standard for normal fax resolution is 204x98. High fax resolution is
204x196.
-
100-400 DPI B&W is adequate
for OCR scans, depending on the font size. 300 DPI is good for 10-12 point
type, 400 DPI for 6-8 point type.
-
600 DPI and higher is for professional
publication-quality graphics or for zooming in and blowing up small portions
of pictures.
-
If you're scanning 35 mm slides
or negatives, you need much higher resolution, since the originals
are relatively small. Slide/negative scanners tend to have resolutions
in the 2000+ DPI range. Some flatbed scanners have special (and often expensive)
transparency adapters. These can be used for scanning slides and negatives,
but unless you have a very high resolution flatbed scanner or are using
very large transparencies, the resolution is usually inadequate for serious
publishing work.
-
If you're doing high-quality
scans, you need a lot of RAM and a fast PC. See my tables below. Professionals
use PCs with 100's of MB's of RAM and GB's of hard drive storage. ZIP drives
are popular for offline storage, but some high-res images can take a whole
ZIP cartridge or more. Hard drive cartridges like the Iomega Jaz can hold
GB's and are fast, but are expensive. CD-R and CD-RW writers are a good
choice, since they have high capacities and low cost per megabyte. CD-R's
can be read by most PC CD-ROM drives. CD-RW's can only be read by newer
multi-spin drives. The main problem is that writing is slow. Tapes have
the lowest cost per megabyte and highest capacity, but random access is
slow. They're best for backing up large hard drives.
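The rules of thumb above assume the image is used at actual size. As a sketch of how the arithmetic changes when you enlarge or shrink, here is a small Python example. The function names are hypothetical; the formulas are just the video resolution calculation quoted above and the relation between sample resolution, image resolution, and magnification.

```python
def screen_dpi(pixels_across: int, inches_across: float) -> float:
    """Effective display resolution: pixels shown divided by visible width."""
    return pixels_across / inches_across


def required_scan_dpi(output_dpi: float, output_inches: float, original_inches: float) -> float:
    """Scan resolution needed so the final image has output_dpi at its output size.

    The magnification is output size over original size; enlarging
    raises the scan resolution needed and shrinking lowers it.
    """
    magnification = output_inches / original_inches
    return output_dpi * magnification


if __name__ == "__main__":
    # 1024 pixels across a screen about 12 inches wide
    print(f"Screen: {screen_dpi(1024, 12):.0f} dpi")
    # A 6-inch-wide print shown actual size on the Web at 72 DPI
    print(f"Actual size: {required_scan_dpi(72, 6, 6):.0f} DPI")
    # A 2-inch-wide detail enlarged to 8 inches for a 300 DPI print
    print(f"Enlargement: {required_scan_dpi(300, 8, 2):.0f} DPI")
```

The same arithmetic shows why slides need so much resolution: a 35 mm frame is only about 1.4 inches wide, so enlarging it to an 8-inch-wide image at 300 DPI calls for roughly 1,700 DPI at the scanner.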
JPEG Quality vs. Compression
The JPEG (Joint Photographic
Experts Group) format is the most popular file storage format for color
photographs on the Web. The graphic nature of the Web has promoted the
use of color graphics, but the bandwidth limits of most Web connections
have made it necessary to reduce the size of graphics files. JPEG has the
advantage that it can provide a tremendous level of file compression and
yet maintain the color and fidelity of photographic images. It stores 24
bits of color information, so it can provide much greater color fidelity
and range than the GIF format, which is an 8-bit (256 color) format. JPEG
is a "lossy" format, however, so some detail can be lost, unlike GIF, which
is lossless. The higher the level of compression, the greater the loss.
JPEG is most appropriate for photographs with smooth color transitions
and not a lot of sharp edges or details. It is not appropriate for line
art or text. It is also not appropriate for storing master copies of images
that will be edited later. Master copies are best stored as BMP or TIF
files. You should not edit or post-process JPEGs and re-save them as JPEGs.
You lose too much image quality that way. JPEGs should be created from
raw scan data or BMP or TIF files. I ran an experiment to see how much
space a JPEG file takes and how the quality varies depending on the level
of compression. I did one scan, sharpened it a little, and converted the
scan data to JPEGs using different quality levels. Here are the test conditions:
Scan size: 5.89" X 3.96"
(4 X 6 photo)
DPI: 21
BMP size: 122,894 bytes
Color: 24-bit
Pixels: 246 X 166
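The BMP size follows directly from the pixel dimensions. Here is a minimal Python sketch of the calculation, assuming an uncompressed 24-bit BMP with the common 54-byte header and each row padded to a multiple of 4 bytes. The function name is made up for this example; the header size is an assumption about what the image software wrote, though the result does match the size listed above.

```python
def bmp_file_size(width_px: int, height_px: int, bits_per_pixel: int = 24,
                  header_bytes: int = 54) -> int:
    """Size in bytes of an uncompressed BMP file.

    Assumes a 54-byte header (14-byte file header plus 40-byte info
    header), no palette, and rows padded to a multiple of 4 bytes.
    """
    row_bytes = (width_px * bits_per_pixel + 31) // 32 * 4
    return header_bytes + row_bytes * height_px


if __name__ == "__main__":
    print(bmp_file_size(246, 166))  # the 246 X 166 test image
```

Each 246-pixel row needs 738 bytes, padded to 740, and 166 rows plus the header come to 122,894 bytes.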
JPEG Quality vs. File Size
The following are the file sizes
for each JPEG quality level. Notice the tremendous reduction in file size,
even with the highest quality. The amount of compression varies with the
image. Images with little detail will compress the most. I chose a test
image that had a lot of detail in the middle third, less detail in the
bottom third, and almost solid color in the upper third. The sample image
is shown below.
| Quality | File Size (bytes) |
| 5% | 1,528 |
| 15% | 3,113 |
| 25% | 4,475 |
| 35% | 5,903 |
| 45% | 7,021 |
| 55% | 8,053 |
| 65% | 9,488 |
| 75% | 11,559 |
| 85% | 15,204 |
| 95% | 25,981 |
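To put the table in terms of compression ratios, here is a short Python sketch that compares each JPEG size with the 122,894-byte BMP. The data is copied from the table above; the function names are made up for this example.

```python
BMP_BYTES = 122_894

# JPEG quality level (%) -> file size in bytes, from the table above
JPEG_BYTES = {
    5: 1_528, 15: 3_113, 25: 4_475, 35: 5_903, 45: 7_021,
    55: 8_053, 65: 9_488, 75: 11_559, 85: 15_204, 95: 25_981,
}


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """How many times smaller the compressed file is than the original."""
    return original_bytes / compressed_bytes


def percent_saved(original_bytes: int, compressed_bytes: int) -> float:
    """Space saved by compression, as a percentage of the original size."""
    return (1 - compressed_bytes / original_bytes) * 100


if __name__ == "__main__":
    for quality, size in sorted(JPEG_BYTES.items()):
        ratio = compression_ratio(BMP_BYTES, size)
        saved = percent_saved(BMP_BYTES, size)
        print(f"{quality:>2}% quality: {ratio:5.1f}:1, {saved:4.1f}% smaller")
```

At 75% quality the JPEG is between 10 and 11 times smaller than the bitmap, and even the 95% setting cuts the file to less than a quarter of the original.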
Examples of JPEGs at Different Quality Levels
The pictures below were scanned
in under the conditions above. The BMP file was saved as progressive JPEGs
at different quality levels. The lower the quality level, the higher the
compression rate and the smaller the JPEG file size. The BMP file is too
big to include, but looks identical (to my eyes at least) to the 75% quality
JPEG. Notice how the quality degrades with increased compression. Details
start to get blurry and artifacts appear in the sky near other objects.
JPEG at 75% quality,
11,559 bytes
This looks virtually identical
to the original BMP file, but is much smaller. 75% quality is a safe compromise
between quality and compression for most images.
JPEG at 55% quality,
8,053 bytes
This still looks acceptable,
but some artifacts can be seen in the hills at the left and along the horizon.
Some of the details are starting to get blurry. You can get away with this
level of compression if you have an image with large details, or where
distortions in the detail are not apparent, such as in pictures of trees
or grass. Also, you shouldn't have detailed areas next to solid-color areas
or else you'll see color artifacts in the solid areas.
JPEG at 25% quality,
4,475 bytes
The details along the horizon
and left side are very blurry. Detail is lost on the hills. The sky shows
some blockiness on the right side. It's marginally acceptable, and the
high degree of compression may be more important than the quality.
JPEG at 15% quality,
3,113 bytes
This looks like a view through
a wet window. Much detail is lost. The sky is severely blocky, with many
color artifacts. This is probably unacceptable for this image, but there
may be some images where this level of quality works. The only way to tell
is to try it and see.
My Space-Squeezing Experience
If you go to my Bay
Area Back Pages you'll see a lot of scanned photographs, around 140.
I originally scanned them in at around 25-50 DPI, depending on how much
I wanted to zoom in on the original. I tried to scan them in so the bitmap
size would be around 100-200K. I then converted them to JPEG files. At
first, I compressed them with 75% quality and got them to around 15-20K.
However, I had so many pictures, I rapidly filled up the 2 MB of space
allotted me by Prodigy Internet. To get more space, I went back and recompressed
as many of them as I could. I did a no-no and re-JPEG'd JPEGs. I was too
cheap to save the original scans in TIF format and too lazy to hunt down
the pictures and re-scan them, but it still worked out OK. Since these
were small pictures, and since these were travel pictures, not art pictures,
I could trade off quality for quantity. I was able to squeeze out over 250KB
more space total (and make room for this Web page). I was able to squeeze
many of the pictures at 55% quality and still have acceptable (in my opinion)
quality. The filesize was reduced to around 8-13K, which is a huge reduction
from the original. Of course, if I didn't have a measly 2 MB to work with,
I wouldn't have had to go through all this trouble (excuse the shameless
plea for more space).
Zooming In
The above photos were scanned
at a low resolution of only 21 DPI, which reduces the 4 X 6 original quite
a bit, but it gives a recognizable image and a JPEG of tolerable size for
Web use. My scanner has an optical resolution of 300 X 300 DPI, which is
way overkill for Web images. The only time that resolution might be used
for an online picture is for zooming in on a small area. In the above image,
I zoomed in on the white school buildings to the left of center, which
is an area of 0.49" X 0.26" and takes 134 KB as a bitmap. The image below
is a 65% quality JPEG of the image, which takes 9604 bytes.
The raw image was somewhat blurry, so I sharpened it up with PhotoImpact. The
sharpening process enhances the edges of objects, but it also introduces
some "noise" into the picture. The original photograph was taken with a
$40 point-and-shoot camera on 35 mm film, processed at a discount store,
so it's not the sharpest original in the world. I don't know if it's true,
but I read somewhere that mass market photofinishers tend to print negatives
slightly out of focus to hide dust, grain, and scratches. This means that
if you scan these prints at high resolution, they'll be blurry. For maximum
resolution and quality, dedicated negative scanners are the best, but that's
only necessary if you're a very serious user or a graphics arts professional.
The point is that if you're using the scanner for Web images, you don't
need very much resolution. Even the cheapest scanners are adequate.
Reliability Considerations
One good thing about flatbed
scanners is that you can easily look inside them and see how well they're
made. You can see if they use precision metal parts or a lot of cheap plastic
parts. You can get an idea of how rugged the construction is. Fortunately,
scanners are relatively simple mechanically. They are like inkjet or dot
matrix printers in that they have an active mechanism that moves linearly
in a precise manner. However, unlike printers, the mechanism moves relatively
slowly and infrequently. Whereas a printer's mechanism may have to move
back and forth at high speed a hundred times or more per page, a scanner's
mechanism only has to slowly sweep once per page (or 3 times for older
3-pass scanners). Thus, a scanner's mechanics are less likely to wear out
than a printer's, given the same quality of construction.
I would guess, based on the
inherent design differences between sheet-fed and flatbed scanners, that
flatbeds are more reliable. The sheet-feds have to handle paper and are
more prone to jamming, just like printers. Paper tends to shed particles,
which can clog the mechanics or dirty the optics. Dust and other contaminants
can get into the innards of a sheet-fed more easily than a flatbed, which
tends to be sealed up. It's like the difference between a floppy disk drive
and hard disk drive.
The component most likely
to go out first is the scanner's lamp. Many scanners, like my E3, use a
standard fluorescent lamp that stays on all the time to stabilize its color
temperature. Other scanners use cold cathode lamps that have 10,000 hour
lives, which is probably beyond the useful life of the scanner. On the
other hand, even though standard fluorescent lamps don't last as long as
cold cathode lamps, they are cheap to replace, about $5.
Scanner Recommendations
What's the best scanner to buy?
That's like asking what's the best car to buy. The best answer is:
it depends on a lot of factors. Here are some key factors to consider:
-
Primary Usage--determines type
of scanner and resolution
-
Price range
-
Scan resolution, bit depth
-
Scanning speed
-
Ease of setup
-
Portability, size, weight
-
Scan area
-
Bundled software
-
Software compatibility
-
Accessories (document feeders,
slide adapters, etc.)
-
Durability, reliability
-
Warranty, service policy
-
Manufacturer's reputation
Type of Scanner
For all-around general use at
home or in the office, the flatbed scanner is the best since it's the most
versatile. It may not be the best at all jobs, but it can do the
most jobs. Some come with options such as auto-document feeders (ADF)
and transparency adapters, though these tend to be very expensive. If you
need automatic sheet-feeding, it's cheaper to buy a dedicated sheetfed
scanner,
though they tend not to have the scan resolution of flatbeds. On the other
hand, most ADF applications don't need high resolution or even color. If
you know you're only going to be scanning sheets of paper and you're short
on desk space, get a sheetfed scanner. If you don't know what you're going
to use a scanner for, get a good flatbed that has an ADF option. Use the
scanner a lot and see if you need an ADF. If so, see if it's cheaper to
buy the ADF option or get a separate sheetfed scanner with built-in ADF
with the resolution you need. An ADF is only appropriate for certain
applications: OCR'ing pages with text only, faxing documents, or copying
pages. Other applications require manually manipulating the image, often
after each scan, so an ADF doesn't help.
Most flatbed and sheetfed
scanners use CCD (charge-coupled device) arrays with a system of mirrors
and lenses to project the scanned image onto the array. A recent innovation
that does away with the mirrors and lenses is the CIS (contact image sensor).
CIS uses a long, thin array of sensor elements next to a row of color LEDs
that provide a light source. The advantage of the CIS technology is that
it allows the scanner to be very thin and light. It also uses less power,
which can allow these to be powered by batteries or by the power from the
USB port. As with any new technology, it is going through some growing
pains, so the image quality is still inferior to traditional CCD designs.
This may change in the future. For most home and office uses, the space-saving
(vertical height, not desk area) and the lower power is not a big deal,
so there's no desperate need for this technology. See "CCD
vs. CIS - Where the New Technology Hits--and Misses."
If you're a serious film
photographer and want the ultimate in picture quality for print publishing,
get a film scanner. Unfortunately, they're much more expensive than flatbeds
and are much more specialized. The professional-level ones are over $1000,
but the prices are coming down to the point where lower-priced models are
affordable by serious amateurs. They are still not quite mass-market yet
(and it's uncertain if they will ever be), so you don't see as many new
models or such intense price competition as in the flatbed market. HP has
one, the Photosmart, that is geared towards high-end consumers and is priced
below $500. It can handle not only slides and negatives, but color prints
up to 5X7. Other big players in this arena are the traditional camera companies,
like Kodak, Nikon, Minolta, Polaroid, Konica, and Olympus. All have models
above and below $1000. Microtek's ScanMaker 35T Plus is the venerable scanner
manufacturer's slide scanner. (See the Scanner
Manufacturers links below.) Color slides have tremendous dynamic range,
much more so than prints. You need a good film scanner, preferably with
a bit depth of 30 bits or more, to capture and take advantage of that
dynamic range. Since film scanners are aimed at professionals or serious
amateurs, performance is more important than ease of setup. That's why
these scanners mostly have SCSI interfaces.
Interface
One important consideration
is what type of interface to use. The most common types are SCSI and parallel
port, with the new USB bus becoming more popular. SCSI scanners should
theoretically be faster than parallel port scanners, but the mechanical
speed of the scanner may make more of a difference, depending on the scan
resolution. Parallel port scanners are a lot easier to set up and can easily
be moved from one PC to another. That's why they've become more popular
than SCSI scanners for SOHO use. SCSI scanners are the best choice for
demanding professional users who need the highest performance and are willing
and able to handle the technical installation details.
SCSI scanners may come with
their own SCSI card. My E3, for instance, came with a low-end Adaptec SCSI
card that only officially supports one device. Other SCSI scanners require
you to provide your own SCSI card. SCSI cards can cost anywhere from as
little as $50 for a low-end model to over $200 for high-end cards. Some
scanners have their own proprietary AT-bus interface cards; some are
plug-n-play. In any case,
unless you already have a SCSI card installed, you need to open up your
PC, which can be a pain for some users. SCSI cards also require a precious
interrupt, which may not be available if you already have a lot of peripherals
on the system (not that the archaic PC architecture has a lot of interrupts
to spare - don't get me started on a diatribe about this).
There are many different
kinds and price ranges of SCSI interface cards. The cards that are typically
bundled with low-cost scanners are very simple cards that are intended
to be used only with the scanner. They run in programmed I/O mode, which
means the CPU has to get involved with each byte of data transferred. The
Adaptec card that came with my Microtek E3 is one such card. While scanning,
it totally ties up the CPU, so all other processes are frozen. For more
money, you can get a general-purpose bus-mastering SCSI interface card
that will not tie up the CPU as much while scanning. It can also interface
with more than one device. This is an advantage. If you can spare an interrupt
to install the SCSI card, you can plug multiple devices into that SCSI
card and only use that one interrupt. However, SCSI peripherals can be
tricky to set up. There are several flavors of SCSI (SCSI-1, SCSI-2, Ultra-SCSI,
wide SCSI, etc.), so you can run into complications if you try to drive
different types of SCSI devices from the same card. Different types of SCSI
use different connectors, which are not compatible. There are adapters
available to convert from one type to another (e.g. 68-pin to 50-pin),
but they tend to be expensive. SCSI also has cable-length limitations
and termination requirements that you have to keep in mind.
If you already have an interrupt
allocated for the parallel port, you don't need another interrupt to run
the parallel port scanner. These scanners have pass-through connectors
so you can hook up your printer to the same port. If you also have a parallel-port
Zip drive, you could (theoretically) daisy-chain all three. However, this
can have compatibility problems, particularly with the printer. You could
use a switch box, but that can also cause problems, depending on how picky
your printer is about signal quality. You also have to remember to manually
set the switch box to the right position. Like SCSI, the parallel interface
has cable length limitations, but they are less well-defined than SCSI's.
Usually, it's the printer that's the pickiest about the cabling. See "SCSI
or Parallel?" by Graeme Bennett.
The easiest interface solution,
in my opinion, is the new Universal
Serial Bus (USB). This bus is found on most new computers and is supported
by Windows 98. It's not as fast as SCSI, at 12 million bits/sec, but it's
much easier to set up and is faster than most parallel ports. It's also
much more expandable. Theoretically, you could hook 127 USB devices on
the same bus, not that you'd want to or could afford to. USB connectors
are smaller and easier to hook up than parallel port or SCSI connectors.
USB cables are also thinner and can be longer. The USB port provides power
to low-powered peripherals like joysticks. Most scanners use too much power
to be powered
from the USB port, but with the new low-powered CIS technology, this may
change. USB is relatively new, but more and more peripherals are coming
out with it. A USB scanner would be my choice if I had a new PC and needed
a new scanner. If you have an old PC and don't have built-in USB ports,
you can add a USB card. Newer Macs, like the iMac, also have USB ports.
If you have a Mac, you need to make sure the software that comes with the
scanner supports it.
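To put the 12 million bits/sec figure in perspective, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the page size (8.5 X 11 inches), the 300 DPI and 24-bit settings, and the function names are my own choices for illustration, and it ignores protocol overhead, so real transfers will take longer.

```python
# Rough estimate of how long an uncompressed scan takes to cross the bus.
# Assumptions (not measurements): letter-size page, no protocol overhead.

USB_BITS_PER_SEC = 12_000_000  # the USB rate quoted in the text above


def scan_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scan, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


def transfer_seconds(size_bytes, bus_bits_per_sec):
    """Best-case transfer time in seconds (raw bus rate only)."""
    return size_bytes * 8 / bus_bits_per_sec


color_page = scan_size_bytes(8.5, 11, 300, 24)  # 24-bit color
line_art_page = scan_size_bytes(8.5, 11, 300, 1)  # 1-bit line art

print(color_page)                                               # 25245000
print(round(transfer_seconds(color_page, USB_BITS_PER_SEC), 1))  # 16.8
print(round(transfer_seconds(line_art_page, USB_BITS_PER_SEC), 1))  # 0.7
```

So a full-page color scan at 300 DPI needs at least about 17 seconds on USB, while a line art page is well under a second. At low resolutions or bit depths, the mechanical speed of the scanner is likely to matter more than the bus.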
You have to decide on what
your priorities are and read the reviews and manufacturers' specs to see
how different scanners compare in each of these factors. See the links
below to go to those specs and reviews. Unless you have limited specialized
applications, the best type of scanner to get is a flatbed. It's the most
versatile, but it does take a lot of space. Personally, I like my Scanmaker
E3. It works great. The software bundle is very good, especially PhotoImpact.
I haven't yet had a need for a higher optical resolution than 300 DPI,
but then my HP660C printer is limited to 300 DPI. However, the E3 is not state
of the art anymore. For a few dollars more, you can get a 600 X 300 scanner
from several companies. 600 X 600 and even 600 X 1200 scanners at reasonable
prices are becoming more common.
Scanner Information and Reviews
-
Welcome
to the DPI Electronic Imaging WWW Site! This is by far the best site
I've seen on scanners. It links to many sites, including Wayne Fulton's
scanning tips, mentioned in the Introduction. His tips link to many scanner
reviews.
-
Scanning
Photographs Good scanning advice from Photodex.
-
FamilyTested
Hardware: Scanners Top rated: HP Scanjet 5PSE, best buy: Microtek Scanmaker
V300, 11/97.
-
How
to Buy Flatbed Scanners: Introduction. Recommended: Microtek Scanmaker
E3 for value, HP Scanjet 5PSE for performance.
-
Umax
Astra 300P Color Flatbed Scanner
-
Scanning
from a Flatbed Review of Microtek Scanmaker E3
-
Scanning
Tips from Jasc Software
-
Affordable
Flatbed Scanner Summary 12/96 "If ultra high quality is imperative,
we would choose the Agfa StudioStar with full Photoshop. If you're on a
budget, go with the Pacific Image Electronics ScanAce."
-
ZDNet
Tech Review: Scanners Reviews Opticpro 4800P, Microtek Scanmaker E6,
HP Scanjet 4C.
-
Review:
Info Peripherals ImageReader FB Flatbed Color Scanner, $330 street, PC
Magazine, 3/4/97
-
How
To Buy A ...Flatbed Scanner - August 1997
-
Scanner
Has a Billion Pluses - August 1997: Plustek OpticPro 9630P 30-bit color
flatbed scanner
-
HP
Network ScanJet 5 - August 1997
-
Tricks,
Tips and Techniques for Superior Scanning - August 1997
-
Prices
Fall, Quality Soars on Desktop Scanners (2/7/97)
-
Features:
Scanners for the Rest of Us (1/97) "Among flatbed models, the Hewlett-Packard
ScanJet 4P won out..for a sheetfed scanner...Storm Technology EasyPhoto
SmartPage"
-
How
Low Can They Go? Color Scanner Prices in a Free Fall (5/22/97)
-
Computer
Shopper: Microtek ScanMaker E3 (6/96)
-
Capture
the Color PC Magazine March '97. Recommends: Mainstream business and
SOHO/home users: UMAX Vista-S12. Graphics professionals: Microtek ScanMaker
III.
-
When
a Flatbed Scanner Is Too Much (Sheetfed scanners, March 1997)
-
Flatbed
Scanners: Reviews At A Glance Links to reviews of 10 scanners
-
Fortune
| Technology Buyers Guide (updated periodically)
-
Computer
Life: HP ScanJet 4p (May 1996)
-
PC
Computing: HP ScanJet 4p (April 1996)
-
Computer
Shopper: Mustek Paragon 800 SP (June 1996)
-
High-Resolution
Scanning PC Today Processor - HP Vs. Microtek Vs. Agfa Vs. Pacific
Image Electronics Vs. Mustek Vs. Ricoh
-
Scan
Perfect: Benchmarks Results of testing 11 scanners, June 1996
-
Mustek
Frequently Asked Questions
-
11/24/97
SCANNERS THAT FIT YOUR DESK--AND YOUR BUDGET "if you have room--try
a flatbed scanner, such as UMAX Technologies Inc.'s new Astra 610S for
$149 or Optic Pro 9630P from PlusTek, for $149. We also recommend HP's
$299 ScanJet 5p."
-
Yahoo!
- Computer Buyer's Guide Top Sellers: Scanners
-
How
to Get the Most out of Your Scanner by Caere Corporation
-
SOHO
Scanners - PC Magazine 11/4/97
-
Value-Minded
Flatbeds - PC Magazine 11/4/97 "The $250 Visioneer PaperPort
6000, our Editors' Choice in this roundup, had the best combination of
software and performance."
-
ScannerUser:
Tips & Tricks -- ZDNet Products
-
Agfa
Scanning Publications
New Links
-
ScannerUser:
Desktop Scanners -- ZDNet Products Links to scanner reviews 12/18/98
-
Make
a Connection with USB -- ZDNet Products 1/6/99
-
Stump
Jeff: Adding USB Ports To PCs And Macs (12/23/98)
-
28
Scanners: Cheaper, Smaller, Better -- ZDProducts 1/25/99
-
Small
Business Advisor - Buying Scanners for Your Business
-
Instant
Image: Digital Camera, Scanner, & Printer Superguide -- ZDNet Products
10/21/98
-
Scanner
Section Index by Graeme Bennett
-
More
Scanning Solutions by Graeme Bennett, 1/5/99
-
Scanner
Q&A by Graeme Bennett 10/27/98
-
Review:
ScanAce 1236s Macworld 7/98
-
PPN:
Review: Artec ViewStation AS6E 3/98
-
ZDNet
Scanner Reviews
-
ZDNet:
Artec ViewStation AM12E 1/25/99
-
Visioneer
PaperPort One Touch 5300 1/11/99
-
Review:
StudioStar 6/17/98
-
PC
Magazine: Desktop Scanners - 10/20/98
-
Buyer's
Checklist - Scanners - Computer Shopper
-
Scanners
for Your Business - ZDNet Small Business Advisor
-
Getting
Pictures into Your Computer - ZDNET 8/26/98
-
Camera
equipment and Scanners Review - A FLAAR Website
-
TechWeb
Scanners & Input Devices - reviews, specs, prices, comparisons, and
ratings
-
The
PC Technology Guide - Scanners
-
DIGITIZATION:
A Literature Review and Summary of Technical Processes, Applications and
Issues
-
Professional
Presence Network - Reviews
-
Xerox
DocuImage 620S
-
USB
Central
-
WebShopper
Scanners Roundup 6/8/98
-
PrePRESS
Technology Report - Scanners, Digital Cameras, and Photo CDs
Film Scanner Information
-
HP
Photosmart Photo Scanners
-
Digital
Scanner Reviews: HP Photosmart, Nikon LS-2000, Olympus ES-10 film scanners
-
CNET
Reviews - Just In - Hewlett-Packard PhotoSmart PC Photography System
-
Tony
Sleep - Home Page (see section on film scanners)
-
More
Scanning Solutions - HP Photosmart Scanner
-
Nikon
Slide Scanner (Cameras Scanners.org)
-
Graphic
Arts: Nikon updates Coolscan 3/19/98
-
Slide-Scanner
Spectrum Macworld Magazine 12/95
-
Olympus's
High-Quality Slide Scanner (PC World Online 1/98)
-
Photo.Net:
Slide Scanners

OCR Software 
OCR stands for Optical Character
Recognition. It's the process of scanning printed or even handwritten materials
and converting the graphical representation of the text into character
or numeric data. The data can then be manipulated by such programs as word
processors, spreadsheets, and databases. Books and articles can be scanned
in and converted to text files. The text files can be orders of magnitude
smaller than the bitmapped scan data. Also, the text files are capable
of being searched, sorted, copied, and filed.
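Here is a quick sketch of how big that size difference can be. The numbers are assumptions for illustration only: a letter-size page scanned as 1-bit line art at 300 DPI, and a guess of about 3,000 characters of text on the page at one byte per character.

```python
# Compare an uncompressed line art scan with the same page stored as text.
# CHARS_PER_PAGE is an assumed, typical-looking figure, not a measurement.

CHARS_PER_PAGE = 3000  # assumption: one byte per character of plain text


def line_art_bytes(width_in, height_in, dpi):
    """Uncompressed size of a 1-bit-per-pixel scan, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return (pixels + 7) // 8  # round up to whole bytes


bitmap_size = line_art_bytes(8.5, 11, 300)
text_size = CHARS_PER_PAGE
ratio = bitmap_size / text_size

print(bitmap_size)       # 1051875
print(f"{ratio:.0f}")    # 351
```

Under these assumptions the text file is roughly 350 times smaller than the line art scan. A 24-bit color scan of the same page would be 24 times bigger still.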
The process of optical character
recognition is not an easy job. It requires a tremendous amount of computer
power, and only with more powerful processors like Pentiums has it become
practical for home computer use. If you consider all the thousands
of fonts that are available, in all different sizes and spacing, the OCR
program has to be very flexible to be able to recognize them. Some characters
are very difficult to tell apart, such as 1, I, l, 0, O, and o, especially
with mixed fonts of different sizes. The better OCR programs can be "trained"
to improve recognition quality, especially with unusual fonts. The quality
of the original is also important for accurate scanning. Originals with
dirt or smudges, or copies with blurry or broken letters, are difficult
to scan. Decimal points can get lost if they're too small, or spots can
appear as periods in the wrong place.
The hardware requirements
for good OCR work are not too demanding. The scan resolution required depends
on the size of the text. The smallest font size that can be scanned effectively
is usually around 6 points. At this size, you need to scan it at 300-400
DPI. For bigger fonts, you can and should scan at a lower resolution. Over-scanning
doesn't help and can even hurt. The OCR program has to sift through all
that scan data to recognize the characters. No use overloading it with
unnecessary samples. Normally, documents are scanned as line art.
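One way to think about this is in terms of how many pixels tall each character ends up. The sketch below is only a rule of thumb of my own: it assumes that what matters is keeping the same pixel height that 6-point type gets at 300 DPI, and scales the resolution down for bigger type. The function names and the proportional rule are assumptions, so check what your OCR program actually recommends.

```python
# Rule-of-thumb sketch: scale scan resolution inversely with font size.
# The reference point (6-point type at 300 DPI) comes from the text above;
# the proportional scaling itself is an assumption, not a vendor guideline.

POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_pixels(font_points, dpi):
    """Nominal height of the type, in pixels, at a given scan resolution."""
    return font_points * dpi / POINTS_PER_INCH


def dpi_for_same_detail(font_points, ref_points=6, ref_dpi=300):
    """Resolution that gives the same pixel height as the reference size."""
    return ref_dpi * ref_points / font_points


print(glyph_height_pixels(6, 300))   # 25.0
print(dpi_for_same_detail(10))       # 180.0
print(dpi_for_same_detail(12))       # 150.0
```

By this estimate, 12-point text scanned at 150 DPI has the same nominal pixel height as 6-point text at 300 DPI, with a quarter of the scan data for the OCR program to sift through.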
The accuracy of the OCR process
is mostly dependent on the software. There are many OCR programs. Most
scanners come bundled with one. The OCR software bundled with lower-end
scanners is often a simpler or out-of-date version of a full-featured
OCR package. Which is the best OCR program? New programs and updates appear
regularly, so the situation changes constantly. In general, Caere's Omnipage
and Xerox's Textbridge have gotten the best ratings. An old, lite version
of Omnipage came with my Scanmaker E3. It does a good job on text in the
10-point range. However, when I tried it on 6-point type, it made a lot
of errors. I didn't try optimizing it, however. I later tried the latest
and greatest version of Xerox's Textbridge (on a UMAX 600P scanner) with
the default setup, and it read the same text with few errors. Many OCR
programs, including Textbridge, allow for training the OCR to improve its
accuracy. An OCR program can be set up to feed text directly into a word
processor. It can appear as a menu item in the File menu of a word processor.

Image Editing Software
Image editing software is essential
for getting the most out of a scanner. Most scanners come with bundled
image-editing software. Often, it is a low-end version of a full-featured
editor. The industry standard for photo image software is Adobe Photoshop.
It has every feature imaginable, and is the standard against which all
other programs are compared. However, Photoshop by itself can cost around
$500-$800--several times more than a low-end scanner by itself. Adobe has
a "lite" version of Photoshop called PhotoDeluxe that's bundled with low
to mid-range scanners in the $100-$300 range. Photoshop itself is bundled
with mid to high-end scanners in the $800-$1000 range and above. Ulead's
PhotoImpact (special edition) is a fairly powerful program that comes with
several scanners, like my Microtek E3. It has lots of image processing
functions and special effects, but unlike the Adobe products, it does not
have layers. Layers allow you to combine multiple images, graphics, and
text, with one overlaid on the other. Some features of image editing software
include the following:
-
Lightening, darkening the image.
-
Adjusting the contrast.
-
Sharpening, softening the image.
-
Changing the color balance.
-
Changing the dimensions and
resolution of an image.
-
Rotating or distorting an image
or other special effects.
-
Cloning: copying one portion
of an image to another. This is a powerful tool for image editing, such
as removing unwanted details.
-
Combining multiple objects,
including text, onto one image.
-
Using "paint" tools to retouch
the image.
-
Erasing parts of the image,
"bluescreen" effects.
-
Removing "moire" effects--patterns
that can appear when scanning halftoned images from printed materials.
-
Adding frames or fancy edges.
-
Converting and saving the image
to different file formats.
-
For more information on how
to use some of these features to improve scan quality, see A
few scanning tips - A Simple Way to Get Better Scans
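To give a feel for what the first two features in the list do to the scan data, here is a minimal sketch of lightening and contrast adjustment on one row of 8-bit grayscale pixels. This is the simplest linear form I can think of, with names of my own choosing; editors like Photoshop or PhotoImpact use their own, more sophisticated methods, so don't take this as how they actually work.

```python
# Simplest-possible brightness and contrast adjustments on 8-bit gray pixels.
# Illustrative only; real image editors use more refined algorithms.


def clamp(value):
    """Round to an integer and keep it in the valid 0-255 range."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(pixels, offset):
    """Lighten (positive offset) or darken (negative offset) every pixel."""
    return [clamp(p + offset) for p in pixels]


def adjust_contrast(pixels, factor, midpoint=128):
    """Stretch (factor > 1) or compress (factor < 1) values around midpoint."""
    return [clamp(midpoint + (p - midpoint) * factor) for p in pixels]


row = [0, 64, 128, 192, 255]
print(adjust_brightness(row, 40))   # [40, 104, 168, 232, 255]
print(adjust_contrast(row, 1.5))    # [0, 32, 128, 224, 255]
```

Notice that values pushed past 0 or 255 get clipped. That detail is lost for good, which is one reason to make adjustments on a copy of the original scan.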
Image Editing Software Links 

Scanner Manufacturers

Computer Art, Graphics, and Photography 

Retailers and Distributors
-
Armorware:
A wholesale shop specializing in laptops, Canon printers, scanners, and
computer accessories.
-
CDW
-
Central
Computer Systems - Application Software
-
Welcome
to CompUSA Direct Shopping
-
Compulink
Compushop - Top Ten - Scanners
-
Compumart
Home Page
-
ComputerESP
Search
-
Welcome
to Computer Graphics Technology!
-
Computown:
Scanning Devices
-
Digital
Darkroom from Ted Pella, Inc.
-
Electroweb:
Scanners, Modems and More
-
Envisions
Online
-
Graphtronix
-
Hi-Tech
USA: SCANNERS
-
All
LCD projectors, presentation products & support materials for presentations
- Presenting Solutions
-
Outpost.com
-
Provantage
-
PSI's
Color Scanners for Desktops and Notebooks
-
QCI
Products - Scanners
-
The
Scanner Guys WEB Page
-
Tech
Shopper: Scanners & Input Devices
-
Bob
Weber Homepage for Scanner, agfa, Linotype, RIP, Harlequin (Professional
scanners)
-
Publishing
Perfection
-
Primary
Simulation, Inc.
-
ScanHelp.com
- The Scanning Software Site
-
Cavin's
Document Imaging Division
1999 Update Note:
This is the first update for
this page in over a year. Most of what I wrote above was done at the end
of 1997. Since then, many things have changed in the scanner world. Prices
have continued to come down. I predicted that scanners would become more
common and even bundled with PC systems. This has been happening. Prices
have come down to the point where you can get a decent scanner for under
$30. I had a table of scanner prices, but they're obsolete now, so I deleted
the table. Prices change so fast, I can't keep up with them, so I'll try
to avoid mentioning specific prices.
Recent hardware innovations
that have popped up since the last update are the Universal Serial Bus
interface and the Contact Image Sensor (CIS) technology, which I discuss
above. I also got a new computer with a USB bus, but I haven't seen any
reason to replace my good old reliable SCSI Microtek E3 with a hot new
USB model.
I also added some more information
about film scanners. I checked and deleted some of the dead links in the
reviews links, but kept some of the old ones since some of the information
is still valid, if not the prices. I updated the manufacturers links. Some
manufacturers (like Storm recently) have died or changed names. New ones
have popped up. The old reliable brands (HP, Microtek, Umax, and Mustek)
are still plugging along. I added a few new retailers. I haven't gotten
around to checking all the other links, so if there are a few dead ones,
I'll prune them out later.
I admit that when I first
created this page, I went overboard on animated GIFs and horizontal rules,
and it's way too long. If I ever get time, I'll re-design and re-organize
this page. I'm still dabbling with Web page styles. To see one of my latest
pages, go to Bay Area Biking or the even more
recent North Coast and Redwood Empire pages.
Click
here to go to Ron Horii's Bay Area Back Pages (lots of scanned photos)
Previous version 12/11/97.
Latest Update 3/10/99