When I left Yosemite, I moved to Ann Arbor, Michigan, where I began graduate school in Information Science. I worked in the university herbarium on a lichen digitization project, which required me to run, and sometimes modify, a Perl script heavily peppered with regular expressions.

Regular expressions are extremely powerful, symbolically loaded expressions. They embed complex representations and operators into single-character symbols.

These symbolic accretions are linked to perform find-and-replace operations on text. For example, when it is fed lines of text, this expression prints every line that contains something shaped like a seven-digit phone number.

if($string =~ m/[\)\s\-]\d{3}-\d{4}[\s\.\,\?]/){print "$string\n"};
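Read symbol by symbol, the pattern asks for a closing parenthesis, a space, or a hyphen; then three digits, a hyphen, and four digits; then a space or a mark of punctuation. A minimal sketch of that reading follows. The sample lines and the name phone_like_lines are made up for illustration and are not taken from the herbarium script.

```perl
use strict;
use warnings;

# The pattern from the line above, compiled once so it can be reused.
my $phone_like = qr/[\)\s\-]\d{3}-\d{4}[\s\.\,\?]/;

# Hypothetical helper: keep only the lines the pattern matches.
sub phone_like_lines {
    my @lines = @_;
    return grep { $_ =~ $phone_like } @lines;
}

# Made-up sample input, for illustration only.
my @sample = (
    "Call (734) 555-0123 today.",   # match: space before, space after
    "Herbarium desk: 555-0199?",    # match: space before, "?" after
    "555-0123 is the number.",      # no match: nothing precedes the digits
    "Reach us at 555-0123",         # no match: nothing follows the digits
    "Plot 102-1905, north slope.",  # match, though it is not a phone number
);

print "$_\n" for phone_like_lines(@sample);
```

The last sample line shows the limit of the expression: it recognizes a shape, not a meaning, so anything with the right arrangement of digits and punctuation passes.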

The lichen specimens were inside folded paper packets, onto which labels were affixed describing the locations, dates, and parties involved in the specimens' collection. We photographed these labels, and then ran the script, which performed OCR (optical character recognition) on the images, returning strings of text that were used to populate file names and metadata. 
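I no longer have the script, so what follows is only a sketch of the step after OCR, in which recognized text is sorted into metadata fields. The field names, the label layout (Loc:, Coll:, Date:), the sample strings, and the function parse_label are all assumptions made for illustration, not the herbarium's actual schema or code.

```perl
use strict;
use warnings;

# Hypothetical: sort OCR text into metadata fields with regular expressions.
# Every field starts empty and is filled only if its pattern matches.
sub parse_label {
    my ($ocr_text) = @_;
    my %meta = (locality => '', collector => '', date => '');

    # The /m flag lets ^ and $ match at the start and end of each line.
    $meta{locality}  = $1 if $ocr_text =~ /^Loc\.?:\s*(.+)$/m;
    $meta{collector} = $1 if $ocr_text =~ /^Coll\.?:\s*(.+)$/m;
    $meta{date}      = $1 if $ocr_text =~ /^Date:\s*(.+)$/m;

    return \%meta;
}

# Made-up OCR output: one legible label, one image with no text in it.
my $label = "Loc: Ann Arbor, Michigan\nColl: A. Example\nDate: 12 June 1907\n";
my $noise = "~~ .;' ,,`` :. ' _ .,\n";

for my $text ($label, $noise) {
    my $meta = parse_label($text);
    print join(' | ', map { "$_=$meta->{$_}" } sort keys %$meta), "\n";
}
```

In this sketch a photograph with no legible label does not raise an error. No pattern matches, and every field is left as an empty string.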

Though I am not certain, it is likely that the script used the Perl module Naive, which is the Perl image-to-text implementation of Ray Kurzweil's omni-font OCR.

Slowly, we built digital representations of the lichen specimens' metadata. I was supposed to photograph only the labels, but I was unsupervised, and I often opened packets to view and photograph the lichen inside. If I accidentally left these images in the batch of photographs to be processed, the script would attempt to parse the lichen as text.

Some of the lichen specimens were over 100 years old. Many were perfectly preserved, and, one could imagine, still slowly growing. It takes fifteen years for Letharia vulpina, wolf lichen, to sprout one tiny branch. If it is still alive, how many years will pass before it breaks out of its packet?

Letharia vulpina

The packets also contained the substrates to which the lichens were attached at the time of harvest: rocks, bricks, tree bark, and, once, bone. 

Both the lichen and the substrates are subject to decay, and in some cases, I would open a packet to find only dust — what was lichen and what was substrate indistinguishable to the eye. 

If I processed an image of this dust, I was no more successful at divining text than when I processed an image of intact lichen. When the script failed, all of the metadata fields remained empty: data out of place.