After processing a box of slides, we have a set of images with all the imperfections of the scanning intact; the scanned part of the mount is dark, but definitely not black, and the ragged edge of the mount (if cardboard) seems much more conspicuous on the computer screen than it did when the slides were projected. It’s time to set some image processing software to work. We need to be able to do at least the following:
- Write a C program to find the edges of the image in the raw slide scan and centre it, so that subsequent steps can work with fixed positions. Necessary because slides don’t come to rest at the same place every time in the scanner. Not discussed further here.
- Replace the scanned part of the mount with a pure black frame. In fact, replace slightly more than just the mount, so that the margins of the visible slide, with the rough edges of the mount and their projecting fibres, are replaced by a smooth line.
- Round the corners of this frame when the mounts have rounded-corner windows.
- Store identification text in the margin area.
- Finally, because it would be impractically tedious to perform the actions identically by hand for a large number of slides, automate the procedure via some sort of batch processing.
Candidates for the job are, in the closed-source world, Photoshop, and in the open-source world, the GIMP. The former gets ruled out early because of its high price. By all accounts, it’s a substantial piece of work, but I don’t want to spend that much money to find out down the road that it doesn’t suit me. Anyway, all I have is Windows 98, where recent versions won’t run. So it’s the GIMP that is going to be looked at, and I don’t need to do anything to get hold of it; it comes with most major Linux distros. That means there are four versions on my multiboot machine at the time… version 1.2.3 on Red Hat 9, version 2.2.13 on SuSE 10.2, version 2.4.0-rc3 on Kubuntu 7.10 and version 2.6.1 on Ubuntu 8.10.
The obvious one to use would be the latest version, but there is a problem. Ubuntu 8.10 isn’t working that well for me. The GUI goes down sporadically, leaving only the mouse pointer reacting; the keyboard is disabled, and clicking doesn’t work any more. I have to go to another machine and do a remote shutdown. Out of the other distros, version 1.2.3 can be dismissed because its scripting has no file-glob procedure, so processing of multiple files is impossible. Also, it pops up no fewer than six windows when activated on an image (…what were they thinking of?). 2.2.13 has gotten over that, but its version of Scheme is old, and has some technical details I don’t fully understand. So 2.4.0-rc3 it’s going to be. Kubuntu 7.10 is my workhorse distro, the only one I have had up to that point which does everything I need to do, and never crashes. It’s running more than ninety per cent of the time, so I won’t have to reboot.
For documentation, we have the in-program help, the GIMP project’s Web documentation and tutorials and whatever else we can find on the Net.
It is the work of a few minutes to fire up GIMP 2.4.0-rc3 and verify that the first three requirements in the task list above are satisfied. You can do all kinds of selections within the image, you can “bucket fill” black into a selected area and you can store text wherever you want. In fact, using the GIMP interactively is quite impressive, if not very intuitive. Evidently a lot of work has been put in to provide a huge array of features. What about batch processing, though?
We need first of all to get to know the scripting language. Go to the GIMP Web tutorials to see what help there is. Sure enough, there is something, though meagre. Two examples only, the second being an example of how to do the first repeatedly over a set of images. That’s the one of interest. Here it is, copied verbatim from their site…
    (define (batch-unsharp-mask pattern radius amount threshold)
      (let* ((filelist (cadr (file-glob pattern 1))))
        (while (not (null? filelist))
          (let* ((filename (car filelist))
                 (image (car (gimp-file-load RUN-NONINTERACTIVE
                                             filename filename)))
                 (drawable (car (gimp-image-get-active-layer image))))
            (plug-in-unsharp-mask RUN-NONINTERACTIVE
                                  image drawable radius amount threshold)
            (gimp-file-save RUN-NONINTERACTIVE
                            image drawable filename filename)
            (gimp-image-delete image))
          (set! filelist (cdr filelist)))))
Good, let’s get hold of that second one and run it. That means copying it into the GIMP’s script directory under the user’s home directory so that the GIMP can “see” it when it starts up.
The language in which the script is written is Scheme, a close relative of LISP. LISP is a relatively technical language, popular among academics and software theoreticians because they can prove theorems about it. The choice of a language like this for scripting closes off the writing of scripts to all but a tiny minority of prospective GIMP batch users in my opinion. All the same, taking a closer look at the script, using a bit of imagination and some familiarity with Unix concepts such as “globbing”, an only moderately software-savvy person can guess that file-glob is going to give us some kind of list of files matching a pattern like "*.png", then filelist will hold this list and the while loop will process one file, chop it off the list and continue until the list is empty. So go to a directory with some images in it and tell the GIMP to execute the command exactly as given on the Web page. Oops.
    > gimp -i -b '(batch-unsharp-mask "*.png" 5.0 0.5 0)' -b '(gimp-quit 0)'
    batch command: experienced an execution error.
Very helpful, eh? Well, there is a script-fu “console” which can be called up from the GIMP itself, and where the error reporting is said to be better, so fire up the console, dig the main command out of the first pair of quotes above, and enter it. Result:
    > (batch-unsharp-mask "*.png" 5.0 0.5 0)
    Error: car: argument 1 must be: pair
Which “car”? There are lots of them. Here we go again… start removing parts of the job until the error goes away. Well, the end result of a lot of groping around can be summarised by the following command results on the console, starting with the file-glob excerpt from the script:
    > (file-glob "*.jpg" 1)
    (2 #( "Richq.jpg" "GateFS.jpg" ))
    > (cadr (file-glob "*.jpg" 1))
    #( "Richq.jpg" "GateFS.jpg" )                 <-- filelist gets this
    > (car (cadr (file-glob "*.jpg" 1)))
    Error: car: argument 1 must be: pair          <-- (car filelist) fails
So file-glob is returning a list, but not directly of matching file names. Rather, it is a list containing a count and then another list, this one with the file names. But… I thought that lists were defined in Scheme by (item1 item2 …). What is this hash character? More groping around with Google, unproductive because the word “hash” generates too many false positives. Finally I got lucky with a site which talks about vectors, of whose existence I was unaware. The author cleverly hid from my searches by referring to # as a “mesh” character! Now we can see what is happening. file-glob returns a list; cadr returns the item which is at the head of the rest of that list, namely the thing with the hash, and that’s not a list, but a vector. car functions expect a list, so the one inside the while loop fails as above.
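The difference can be tried out in the Script-Fu console with a minimal sketch like the one below. The vector is built by hand to stand in for what file-glob returned above, and the name filelist merely mirrors the script; only standard Scheme vector procedures are used.

```scheme
; A hand-built stand-in for the second element of the file-glob result
(define filelist (vector "Richq.jpg" "GateFS.jpg"))

(vector? filelist)         ; #t: it is a vector, so car and cdr do not apply
(vector-length filelist)   ; 2: the number of entries
(vector-ref filelist 0)    ; "Richq.jpg": the vector counterpart of car
(vector->list filelist)    ; ("Richq.jpg" "GateFS.jpg"): a true list again
```

Converting with vector->list is the least intrusive option for a script written around car and cdr, since the loop itself can then stay as it was.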
So now I know why the script doesn’t work. But in another sense, I don’t know at all. Why does the flagship example of a batch processing script given on the GIMP’s own Web site not work? A little more research with Google reveals the truth: it did work on previous releases of the GIMP, but I’ve got version 2.4.0-rc3, which came with the distro I’m currently using. As is evident from the version, it is not a final release, but “release candidate 3”, and somebody did something to file-glob, so that it returns a vector instead of a list. Nobody updated the example batch script, or at least made a remark on the Web page to warn users, and my distro has this as the latest version in its repository. Thus the waste of a couple of days.
The procedure browser provided is a good idea, sadly let down by failure to carry things through. Some of the descriptions contain useful information, but there are others like that for gimp-floating-sel-rigor. Its action is described as “Rigor the floating selection”. Its two parameters are listed, then the Additional Information is “This procedure rigors the floating selection”. Aha, so that’s what it does! The “additional information” sections are often fatuous anyway. Here are a few examples (in their individual totality):
- More help here later
- FIXME: write help
- Yeah!
- Dodgeburn. More details here later.
- More help
- There's no help yet.
- No help yet. Just try it and you'll see!
- This function takes an image and blah blah. Hooray!
- Help? What help? Real men do not need help :-)
The Script-Fu Console has its own help button. It leads to a page in the bundled GIMP help file entitled "Appendix D. Eeek! There is missing help", and exhorting you to "feel free to join us and fill the gap by writing documentation for the GIMP." Right... feel free to wash up and put out the empties after we’ve drunk all the beer.
This is sadly typical of the Open Source world in my experience.
Faced with the above tribulations, the instinct is to start Googling. That leads mostly nowhere either. The problem is best illustrated with an example. Let’s pick a procedure at random, say gimp-layer-add-mask (which is actually better documented than some), and search for occurrences on the Web. Running Google with just the name as above gives 517 results (20 Jan 2009).
- Scanning down these, you find the list stops at 73. The rest Google regards, no doubt reasonably, as duplicates. Oh well, let’s look at every one of these to see what there is.
- Sixteen are (more or less) cut-and-paste duplicates. That’s really not so bad… cutting and pasting like this is a major curse when looking for technical software information on the Net, and is significantly worse in other areas, spreading useless replications everywhere like weeds. We’re down to 57.
- Of these, 28 have file type ".scm". That means they’re just the bare Scheme scripts using this procedure. No hope of extra information there. 29 left.
- And as for those, they are mostly just the scripts included in forum discussions or bug reports where this procedure happens to be used. Two give the C code for related user dialogue and a few are in foreign languages. NONE are devoted specifically to the procedure we want to know about.
Another thing which held me up was that the Nikon software includes two images inside each TIFF file, the real one and a thumbnail, and my script was by default returning me the thumbnail. By the time I had guessed how to specify the layer with the real image, I realised that the image-get-layers procedure was returning a vector too, although the procedure browser entry talks of a “list”. Perhaps every function which returns a vector once returned a list? No, a little research with the Script-fu Console in older GIMP versions shows that image-get-layers at least used to return something like (2 #(3 2)#2"0300"). Eh?? Oh hell, let’s not go there…
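One way to avoid depending on the order of the layers is to pick the largest one, on the assumption that the thumbnail is always the smaller of the two. The sketch below is not the script I actually used. largest-layer is a hypothetical helper, and it assumes that gimp-image-get-layers returns a count followed by a vector of layer IDs, as seen in 2.4.0-rc3, and that the image holds at least one layer.

```scheme
; Hypothetical helper: return the ID of the layer with the largest area.
; Assumes (gimp-image-get-layers image) yields (count #(id1 id2 ...)).
(define (largest-layer image)
  (let* ((result    (gimp-image-get-layers image))
         (count     (car result))
         (ids       (cadr result))        ; a vector, not a list
         (best      (vector-ref ids 0))   ; fails if the image has no layers
         (best-area 0)
         (i         0))
    (while (< i count)
      (let* ((layer (vector-ref ids i))
             (area  (* (car (gimp-drawable-width layer))
                       (car (gimp-drawable-height layer)))))
        ; keep this layer if it is bigger than anything seen so far
        (if (> area best-area)
            (begin
              (set! best layer)
              (set! best-area area))))
      (set! i (+ i 1)))
    best))
```

In the batch script, the result would take the place of the drawable obtained from gimp-image-get-active-layer.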
Anyway, you have somehow finally worked out enough about some procedures to compose a first script, so you run it.
In the normal course of events, it fails the first time. If you are a little lucky, the message may tell you that, e.g., you tried to do something with a layer that doesn’t exist, and it may helpfully suggest that, in this case, the reason could be that the image was deleted, but most of the time what you see is:
    batch command: experienced an execution error.
Not a word about what happened, or where. There’s only one way to cope with this… creep up on your solution by making the smallest changes possible between tests. That way you know exactly what little thing it was that stopped your script working.
Of course, just understanding the little example given above doesn’t mean you are able to write Scheme. The lightweight introduction to writing scripts given in the GIMP online manual is nowhere near enough; string manipulation is completely absent, for example. I was trying to guess the names of functions like string-append until I stumbled on the MIT Scheme Reference. This references a much broader implementation of Scheme, but the string manipulation and conversion routines it lists are defined in the GIMP Scheme specification also. Once I had found this, the way was open for processing file names the way I needed to.
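As an illustration of the kind of file name processing these routines allow, here is a minimal sketch that inserts a suffix in front of the extension. The helper names last-dot-index and add-suffix, and the "-framed" suffix, are invented for this example and are not taken from my script; only standard string procedures are used.

```scheme
; Hypothetical helper: index of the last "." in name, searching backwards
; from position i, or -1 if there is none.
(define (last-dot-index name i)
  (cond ((< i 0) -1)
        ((char=? (string-ref name i) #\.) i)
        (else (last-dot-index name (- i 1)))))

; Hypothetical helper: put suffix in front of the extension, or at the
; end of the name when there is no extension.
(define (add-suffix filename suffix)
  (let ((dot (last-dot-index filename (- (string-length filename) 1))))
    (if (< dot 0)
        (string-append filename suffix)
        (string-append (substring filename 0 dot)
                       suffix
                       (substring filename dot (string-length filename))))))

(add-suffix "slide01.tif" "-framed")   ; "slide01-framed.tif"
```

The sketch looks only for the last dot in the whole string, so a dotted directory name in front of an extensionless file would fool it; for names as returned by file-glob in the current directory that is not an issue.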
It would be tedious to list all the steps on the journey to a working script; a lot of trial and error was needed. I was struck by the number of times when I thought I understood a procedure, only to get an unexpected result when I used it. This wasn’t entirely due to native stupidity; many of the GIMP’s modes of operation in conventional interactive use are less than intuitive. For example, the operator actions required to delete a layer’s transparency channel are hardly obvious, and this translates directly into the odd-looking procedure calls used for the purpose in the script. In the end, it wasn’t cleverness that was needed to complete the job, but patience.
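For orientation, the framing step at the heart of the task can be sketched as follows. This is a simplified outline under stated assumptions rather than the finished script: blacken-outside is a hypothetical name, the window position and size are assumed to be known in advance because the images were centred beforehand, and the procedure names and argument orders are those I understand the 2.4 series to provide, so they should be checked in the procedure browser.

```scheme
; Hypothetical helper: paint everything outside the given window black.
; x, y, width and height describe the part of the slide to keep;
; radius is the corner radius in pixels, or 0 for square corners.
(define (blacken-outside image drawable x y width height radius)
  (if (> radius 0)
      ; rounded window: antialiased edge, no feathering
      (gimp-round-rect-select image x y width height radius radius
                              CHANNEL-OP-REPLACE TRUE FALSE 0 0)
      ; square window: no feathering
      (gimp-rect-select image x y width height
                        CHANNEL-OP-REPLACE FALSE 0))
  (gimp-selection-invert image)            ; select the frame, not the window
  (gimp-context-set-foreground '(0 0 0))   ; pure black
  (gimp-edit-fill drawable FOREGROUND-FILL)
  (gimp-selection-none image))
```

Choosing the window slightly smaller than the visible opening of the mount is what replaces the ragged cardboard edge with a smooth line.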
With the script at last polished off and comprehensively tested on the first version of the GIMP (2.4.0-rc3), it’s time to try to make it useful to the maximum number of people by seeing if it works on some other versions. Reboot into Ubuntu 8.10 with the significantly newer version 2.6.1, run the script on the same collection of test images and…
    batch command: experienced an execution error.
Groan. Are we going to start all over again? But luckily the first thing I try is to run file-glob in the GIMP console, and bingo!, it’s once again returning a list. A quick adjustment to the script to cope with both cases and, mirabile dictu, it works! A positive software experience for once.