[colug-432] Photo digitization recommendations?

Peter Kukla fruviad at yahoo.com
Tue Jan 17 17:22:40 EST 2017


I'm not going to try to reply to each point raised by each of you, but I will say that having someone else scan the pictures sounds like it might be the best bet.  All of you have given me good information to consider...thanks.
I'll note that some of the pictures I've uncovered so far date back to the 1930s, or even earlier.  I'd be very leery of damaging them with an automatic feeder.  However, I'd also be happy enough to do them a few at a time, making a longer-term project of it (and that appeals to my cheapskate nature).
More research lies ahead, I think.  I'll be better prepared now that I have this additional information.
Thanks again...

-peter

      From: Jim Wildman <jim at rossberry.com>
 To: Central OH Linux User Group - 432xx <colug-432 at colug.net> 
 Sent: Tuesday, January 17, 2017 11:22 AM
 Subject: Re: [colug-432] Photo digitization recommendations?
   
If you have the negatives, negative scanning might be an option as well.  My wife worked at Walmart photo for about a year and managed to
scan all of our pre-digital negatives to CD on her lunch breaks.

On Mon, 16 Jan 2017, Rick Hornsby wrote:

> 
> I'll give you my feedback as a photographer, but not necessarily as someone who has tried to come up with a shared/distributed end-to-end
> system like you're aiming at here.
> 
> On January 12, 2017 at 12:32:19, Peter Kukla (fruviad at yahoo.com) wrote:
> 
> Hi COLUGers,
> 
> I have hundreds -- maybe thousands -- of photographs that I'd like to scan.  My goal is to digitize the whole batch and distribute them
> to the family members of those in the pictures.
> 
> For the "thousands" volume you're looking at, you won't want a flatbed scanner. It is going to take you freaking forever. Even if you put
> multiple images on the scanner bed at once, you still have to digitally cut them back into individual photographs afterward.
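For what it's worth, the "digitally cut them back" step can be partly automated. Here is a rough sketch, assuming the scan has already been loaded as rows of grayscale values (0 = black, 255 = white) and that the prints sit in a loose grid with near-white scanner background between them. The names `find_photos` and `white` are made up for illustration, and a print with a large white area of its own could be split incorrectly.

```python
from typing import List, Tuple

Grid = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def _runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs for each run of True values; end is exclusive."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def find_photos(scan: Grid, white: int = 250) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes for prints separated by white bands."""
    width = len(scan[0])
    boxes = []
    # First split the scan into horizontal bands that contain any non-white pixel.
    row_has_ink = [any(p < white for p in row) for row in scan]
    for top, bottom in _runs(row_has_ink):
        band = scan[top:bottom]
        # Then split each band into columns that contain any non-white pixel.
        col_has_ink = [any(row[x] < white for row in band) for x in range(width)]
        for left, right in _runs(col_has_ink):
            boxes.append((left, top, right, bottom))
    return boxes
```

Each returned box could then be handed to whatever cropping tool you prefer; the sketch only locates the prints, it does not crop or deskew them.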
> 
> You definitely _do_ want a flatbed scanner for old, damaged, or delicate prints that can't be put through an automatic document feeder type
> system. If you use an ADF, make sure it is one designed for photographs. A normal sheet feeder may mangle your one-of-a-kind print.
> 
> Photographers digitizing images will sometimes use an actual camera to photograph the print. This requires a light box, and is labor
> intensive, but can provide a very high quality result and can reduce the risk to the print.
> 
> With that said, I strongly suggest finding a local vendor that can do the digitizing for you. Unless you just have a lot of time on your
> hands.
>
>      I have a Linux-based household, so the hardware & software would need to be Linux-friendly.  Hence my bugging you guys.
>
>      The project will require a fast scanner, given the number of photos in question.  Waiting 2-3 minutes for each scan to complete is
>      fine if I only have a few pages to scan, but not for a large project like this.
> 
> You can't do a fast _and_ quality scan of a print photograph with off-the-shelf gear. Remember that a print is an analog medium with a
> (comparatively) massive resolution. Your best bet, honestly, is to give your prints to a vendor that specializes in digitizing them. From
> there, you can take the digital versions and do whatever you like with them - cataloging, setting metadata, allowing others to set metadata
> through a web UI, etc. You can ask the vendor to provide you the original TIFF images (they're going to be huge files), and then you can use
> something like imagemagick to convert them down to JPEG for the next steps in your process.
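A minimal sketch of the imagemagick step mentioned above, assuming the classic `convert` command is on the PATH (ImageMagick 7 installs it as `magick`). The function name and the defaults of 1600 pixels and quality 85 are illustrative choices borrowed from elsewhere in this thread, not settings anyone here has tested on these photos.

```python
from typing import List


def jpeg_command(tiff_path: str, jpeg_path: str,
                 long_edge: int = 1600, quality: int = 85) -> List[str]:
    """Build an ImageMagick command that turns a TIFF master into a smaller JPEG."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    return [
        "convert", tiff_path,
        # The trailing ">" tells ImageMagick to shrink only images larger than this box.
        "-resize", f"{long_edge}x{long_edge}>",
        "-quality", str(quality),
        jpeg_path,
    ]
```

To actually run it, pass the list to `subprocess.run(jpeg_command("scan001.tif", "scan001.jpg"), check=True)`. The TIFF is only read, so the archived original stays untouched.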
> 
> If you decide to do it yourself, you're going to have to be patient and it is going to take a long time to scan them all.
> 
> You can save yourself filesize by converting the TIFFs down to monitor-ready 72dpi JPEGs once they're scanned, but to transfer the analog into
> digital initially, you want as much resolution as your scanner will offer for a color print. If you set the scanner to capture at 72dpi, your
> images will look like garbage. For preserving as much quality as possible, I'd also probably not let the scanner software do the JPEG
> conversion even if it offers that feature.
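To put rough numbers on the resolution advice above, here is a small sketch of the arithmetic. It assumes 24-bit color (3 bytes per pixel) and ignores any compression. The 600dpi figure in the usage note is only a hypothetical scanner setting, not a recommendation from the thread.

```python
from typing import Tuple


def scan_size(width_in: float, height_in: float, dpi: int,
              bytes_per_pixel: int = 3) -> Tuple[int, int, int]:
    """Return (pixel_width, pixel_height, uncompressed_bytes) for a scanned print."""
    px_w = round(width_in * dpi)
    px_h = round(height_in * dpi)
    return px_w, px_h, px_w * px_h * bytes_per_pixel
```

Under those assumptions, a 6x4 inch print comes out at 3600x2400 pixels and about 26 million bytes uncompressed at 600dpi, versus 432x288 pixels at 72dpi.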
>
>      Once scanned, I'll need to catalog the photos, describing who is depicted in each one.  In the interest of ensuring that the
>      metadata is associated with the photos, one thought is to embed the metadata via IPTC tags, although I haven't explored that
>      option very deeply yet.
> 
> 
> IPTC tags are a good choice. Take a look at the tags and tools for managing EXIF/IPTC data. For library and CLI usage, exiftool is one of the
> best I know of:
> 
> http://www.sno.phy.queensu.ca/~phil/exiftool/
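As a rough illustration of the exiftool suggestion, here is a sketch that builds a command line for writing an IPTC caption plus one keyword per person. `Caption-Abstract` and `Keywords` are standard IPTC tag names in exiftool. The function name and the idea of storing people as keywords are my assumptions, not something settled in this thread.

```python
from typing import List


def tag_command(image_path: str, caption: str, people: List[str]) -> List[str]:
    """Build an exiftool command that writes an IPTC caption and one keyword per person."""
    cmd = ["exiftool", f"-IPTC:Caption-Abstract={caption}"]
    # "+=" appends to the Keywords list instead of replacing what is already there.
    cmd += [f"-IPTC:Keywords+={name}" for name in people]
    cmd.append(image_path)
    return cmd
```

Run it with `subprocess.run(tag_command(...), check=True)`. By default exiftool keeps a backup copy named with an `_original` suffix, which is handy while experimenting.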
>
>      However, the photos being scanned also are mostly of people I don't know (many pictures are from the wife's side of the family)
>      and the family members who would know are widely dispersed, geographically.  It would be nice to have some sort of web-based, open
>      source solution where I could load the images into a database of some sort and allow users to tag the pictures with details that I
>      may not know ("Hey...that's Uncle Gump at our 1973 Grand Penguin Ball!")
>
>      Anyone ever done anything along these lines?
>
>      I'm looking for recommendations & advice for:
>
>          * Good, reliable, and fast flat-bed scanners that are Linux-friendly
> 
> The best I'm able to give you there is to point you at something like this: http://www.pcmag.com/article2/0,2817,2362752,00.asp
> 
> Here's what else I'd say about that: you are free to insist on Linux if you wish. However, you may have difficulty finding software that
> provides an efficient workflow. That is, I think you can use GIMP to scan things, but you'll probably want to smash your mouse with a
> big hammer trying to use GIMP for the volume you need. It simply wasn't designed for that. Chances are you'll have much better luck finding
> the right tool for volume scanning in Windows or macOS. You may have to bite the bullet and buy a Windows license for this project. (I think
> you can d/l and use Windows 10 for 30 days before it starts harassing you.)
> 
> To put it another way, I can use a screwdriver to apply and sand drywall mud because dammit everyone in the house has a screwdriver and it's
> the best tool there is. But holy crap it's going to be a pain in the ass to fix a head-sized dent this way.
>
>          * Preferred image formats (the IPTC aspect may reduce the number of options?)
> 
> TIFF for the archived originals, JPEG for everything else. JPEG without question, whether you decide to keep the TIFF originals or not. As far
> as JPEG compression, 85% quality is right around the breaking point: above it, the filesize penalty starts to hit without a discernible
> gain in image quality. You can go up to 90%, but at 72dpi I don't think it will matter. You could go down to 80%, but it's really not worth
> it. Anything lower and your photos will start to look bad.
> 
> I should mention that once you drop the image resolution down to 72dpi, the resulting file will not be suitable for printing. Yes, the local
> drug store will still print it for you, but the quality is going to be poor. Creating prints from digital images, especially anything larger
> than small sizes like 4x6, needs an image resolution of at least 240dpi.
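The print-size point above boils down to dividing pixel dimensions by the print resolution. A quick sketch, using the 240dpi floor from the paragraph above as the default; the function name is just for illustration.

```python
from typing import Tuple


def max_print_inches(px_w: int, px_h: int, dpi: int = 240) -> Tuple[float, float]:
    """Return the largest print size, in inches, an image supports at a given dpi."""
    return px_w / dpi, px_h / dpi
```

By this arithmetic a 1440x960 pixel image just covers a 6x4 inch print at 240dpi, while a 432x288 pixel screen copy only covers about 1.8x1.2 inches.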
> 
> Side note: if you want a quality print, please don't go to the local drug store. They produce awful results. I mean, really awful. Use a
> professional print service like mpix or one of the others.
> 
> I suggested dropping down to 72dpi because that's typically the highest useful dpi that can be displayed on a TV or monitor. Anything higher
> is just wasted filesize - including on a website where people are just manipulating the metadata. You can always copy the metadata from the
> screen-resolution images to your high res images on the backend.
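Copying the metadata back onto the high res files could look roughly like this. The sketch pairs files by base name and uses exiftool's `-TagsFromFile` option to copy the IPTC group. The directory layout and the function names are assumptions made up for the example.

```python
from pathlib import PurePath
from typing import List


def copy_iptc_command(tagged_path: str, master_path: str) -> List[str]:
    """Build an exiftool command that copies all IPTC tags from one file to another."""
    return ["exiftool", "-TagsFromFile", tagged_path, "-IPTC:all", master_path]


def copy_commands(tagged_files: List[str], master_files: List[str]) -> List[List[str]]:
    """Pair files by base name (img001.jpg with img001.tif) and build one command per pair."""
    masters = {PurePath(m).stem: m for m in master_files}
    commands = []
    for tagged in tagged_files:
        master = masters.get(PurePath(tagged).stem)
        if master is not None:  # skip screen copies that have no matching master
            commands.append(copy_iptc_command(tagged, master))
    return commands
```

Each command list can be run with `subprocess.run(cmd, check=True)`. Only metadata is rewritten, so the image data in the high res file is not recompressed.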
> 
> JPEG is an excellent format for storing photographs. TIFF is superior quality (lossless compression), but a very large file size. PNG
> compression is lossless as well, but to handle the larger color palette of a photograph means a significantly larger file size, or a smaller
> color depth which is bad for a photograph. PNGs are much better suited as GIF replacements or for screengrabs. They're not good for photos.
> 
> JPEG can handle the EXIF/IPTC data. However, here's something important to remember about JPEG images: because the compression is lossy, every
> image edit incurs a quality penalty in the resulting saved file. If you need to make image edits (crop, color correction, etc) - do those in
> the original TIFF image and then export to JPEG. Don't open the JPEG, edit the image, save/close it, and then come back later to do more
> edits. At a low compression ratio (high quality), you can get away with doing it a few times. Even at 100% quality setting, there's still
> lossy compression and thus still a quality penalty.
> 
> This penalty does not apply to editing the metadata using things like exiftool. It only applies to the image itself.
>
>          * A web-based digital archive system that supports user feedback and possibly downloading of the original images
> 
> Feedback and downloading the original, yes. There are tons of both self-hosted and "cloud-based" solutions. Of these, which allow unlimited
> access to edit the IPTC fields? That may have to be something more home grown, I'm not sure.
> 
> If you plan to allow them to download the original from the website and you're rolling your own solution, then plan to have two different JPEG
> image files - one a medium resolution (1600px on the long edge should be good) for displaying in the webpage, and the high resolution JPEG
> file they can download. You may want the high res image to have a higher preserved dpi from the TIFF as well. 4000px on the long edge at
> 200dpi should be sufficient for most people's small print needs.
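Here is a small sketch of the sizing arithmetic for those two derivatives: scale the image so its long edge hits a target, keep the aspect ratio, and never enlarge. The function name is hypothetical, and the 6000x4000 source size in the usage note is just an example.

```python
from typing import Tuple


def fit_long_edge(px_w: int, px_h: int, long_edge: int) -> Tuple[int, int]:
    """Scale dimensions so the longer side equals long_edge, never enlarging."""
    longest = max(px_w, px_h)
    if longest <= long_edge:
        return px_w, px_h  # already small enough; leave it alone
    return round(px_w * long_edge / longest), round(px_h * long_edge / longest)
```

For a 6000x4000 pixel scan this gives 1600x1067 for the web copy and 4000x2667 for the download copy. At 200dpi, 4000 pixels works out to a 20 inch long edge.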
>
>          * Any pitfalls I may encounter that I haven't yet thought about
> 
> Remember to think about your audience when designing a UI for them. These will almost certainly be non-technical people who won't understand
> the IPTC labels, and who will probably make mistakes when entering the data. It might be helpful to design it in such a way that there's some
> editorial control. For example, if possible, try to ensure grandma can't accidentally overwrite Aunt June's valid/good description
> with a copy/paste of your email address. Figuring out ways to manage the data and manage the user input of your system will probably end up
> being more work than you anticipate. Then again, it's very possible someone has already solved this very thing and there's a suitable product
> out there.
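One way to get that editorial control is to treat every family edit as a suggestion that sits in a queue until someone approves it. A minimal in-memory sketch; the class and field names are invented for illustration, and a real site would need storage, logins, and so on.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Suggestion:
    photo_id: str
    author: str
    text: str


@dataclass
class CaptionStore:
    approved: Dict[str, str] = field(default_factory=dict)
    pending: List[Suggestion] = field(default_factory=list)

    def suggest(self, photo_id: str, author: str, text: str) -> None:
        """Queue a suggestion; the approved caption is never touched here."""
        self.pending.append(Suggestion(photo_id, author, text.strip()))

    def approve(self, index: int) -> None:
        """Promote a pending suggestion to the approved caption for its photo."""
        suggestion = self.pending.pop(index)
        self.approved[suggestion.photo_id] = suggestion.text

    def reject(self, index: int) -> None:
        """Discard a pending suggestion without changing anything else."""
        self.pending.pop(index)
```

With this shape, a stray copy/paste lands in `pending` and can be rejected, while the existing description stays put until an editor explicitly approves a replacement.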
> 
> To preserve your sanity, it might be helpful to spend some time with your wife or your geographically proximate family to knock out what you
> can, before crowdsourcing to the family what you can't.
> 
> Good luck, and let us know what you come up with.
> 
> -rj
> 
> 
>

----------------------------------------------------------------------
Jim Wildman, CISSP, RHCE      jim at rossberry.com http://www.rossberry.net
"Society in every state is a blessing, but Government, even in its best
state, is a necessary evil; in its worst state, an intolerable one."
Thomas Paine


   