Website Issues

Over the last month, there have been a few problems with the site, but I think I’ve finally gotten it all straightened out.  Chris Schneider’s site is being hosted here now, and the software we wanted to use to send out his email newsletter wasn’t working.  So, my hosting provider suggested that they move me to a new server to fix the problem.  That caused a few new problems – the worst of which were new errors on Ninerpedia.  I also lost everything that I had updated since December.  But, Michael Zapf got Ninerpedia up and running again, and I think that I’ve gotten just about everything else back to where it was.  Oh, and the newsletter software for SHIFT838 is working, so that’s good.  Sorry for any inconvenience, and thanks to Michael Zapf for all of the work on Ninerpedia.

Scanning and Editing Documents

One of my goals has always been to scan all of the TI99 documents in my collection, and upload them to my site so that others can enjoy them.  When I first started scanning things many years ago, the only way that you could make PDF documents was to pay a lot of money for Adobe software.  That wasn’t an option for me, so I used the free software that came with my Visioneer scanner to save scans into the MAX document format.  I’ve tried to convert most of the MAX files on this site to PDF format, but if you come across any on ftp.whtech.com, you can still view them using the old MAX viewer.  You can download a copy here.

Since then, the Visioneer software has changed into PaperPort, and I’ve purchased several licenses over the years.  The nice thing about PaperPort is that, besides scanning, editing, and ordering pages in documents, it lets you save a document as a PDF file.  It also handles MAX files, so I’ve used it to convert them to PDFs.  When I scanned my TI-99/4 manual, I didn’t want to break the binding, so I made one of these DIY book scanning rigs and used my digital camera to take a JPEG photo of each page.  That’s when I became familiar with ScanTailor.

ScanTailor is free, and does an amazing job of cleaning up and organizing document images.  ScanTailor only handles document pages in TIFF or JPEG (perfect for using with digital cameras).  If the pages are already in JPEG format, like mine were when I was using my camera, you just put them in one directory.  If they’re in a different format, you’ll have to convert them.  PaperPort is great for this, as you can ‘unstack’ PDF files and then save the pages as JPEG or TIFF.  When you start ScanTailor, you point it to the directory that you’ve saved your pages in.  ScanTailor then lets you reorder and process these pages.  There are 5 processing stages: fix orientation (rotate the page), split pages (if there are 2 pages in each image, tell ScanTailor where the split is between them), deskew (correct tilted pages), select content (show the program where the content is on each page; it tries to detect it, but sometimes it misses the page numbers in the corner, so I have to drag the selection box to cover them), and margins.  After that, you are given an opportunity to despeckle (remove stray dots) and dewarp (fix parts of the image that are curved because a page wasn’t flat), and then start processing.
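
If you want to check the page order with a script before pointing ScanTailor at the directory, here is a minimal Python sketch.  It is not part of ScanTailor or PaperPort.  It just lists the image files in a directory in ‘natural’ order, so that page2.jpg comes before page10.jpg.  The function names and the list of extensions are my own assumptions, so adjust them to match whatever your camera or scanner writes.

```python
import re
from pathlib import Path

# Assumed set of page image extensions; change to suit your files.
PAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff"}


def natural_key(name):
    """Build a sort key that compares runs of digits as numbers.

    re.split with a capturing group alternates text, digits, text, ...
    so the odd positions are always the digit runs.
    """
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def list_pages(directory):
    """Return the page image files in `directory` in natural order."""
    files = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in PAGE_EXTENSIONS
    ]
    return sorted(files, key=lambda p: natural_key(p.name))


if __name__ == "__main__":
    # Print the pages found in the current directory, in order.
    for page in list_pages("."):
        print(page.name)
```

A plain alphabetical sort would put page10.jpg ahead of page2.jpg, which is the usual reason pages show up out of order.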

The pages are placed into an ‘out’ directory as JPEG or TIFF images.  If the despeckle function didn’t clean up some of the pages enough, I’ll open them up in a graphics program (I use  Paint.NET – it’s free) and erase any stray marks.  Then, I import the images into PaperPort, ‘stack’ them in the correct order, and save them as a PDF.
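
Before stacking, it can help to make sure no page went missing along the way.  The sketch below is only an illustration, and it assumes that each output file name ends with its page number right before the extension (for example page007.tif).  Your files may be named differently, especially if you split double-page images, so treat the pattern as a placeholder.

```python
import re


def missing_pages(filenames):
    """Return the page numbers missing between the lowest and highest found.

    Assumes each name ends in digits just before the extension,
    e.g. 'page007.tif'. Names that do not fit are ignored.
    """
    numbers = set()
    for name in filenames:
        match = re.search(r"(\d+)(?=\.[^.]+$)", name)
        if match:
            numbers.add(int(match.group(1)))
    if not numbers:
        return []
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in numbers]


if __name__ == "__main__":
    import os

    # 'out' is the directory name mentioned above; adjust the path as needed.
    if os.path.isdir("out"):
        print(missing_pages(os.listdir("out")))
```

An empty list means the numbering has no gaps.  It does not tell you whether the first or last pages are missing.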

The last step is to run the PDF file through an OCR program so that you can search within the PDF.  I’ve been using the free version of PDF-XChange Viewer, which you can download at http://www.tracker-software.com/product/pdf-xchange-viewer.  Once you open a PDF with it, click on the ‘Document’ tab, and choose ‘OCR Pages’.

I’ve got a bit more on this, so I’ll post it in a ‘part 2’.  If you do things differently, or you have any suggestions, I’d like to hear them.  Please post comments!

-Rich