CS 1 Fall 2008

Big Idea

Quoting and Tagged Data

Monday, November 3, 2008


Up until now, the data we have been using has always been literal data, generally numbers. It is often desirable to work with symbolic data directly. Normally, when we type a symbolic name (e.g. x) into the Scheme interpreter, it looks up the value bound to that name. But sometimes we just want to use the name itself (the symbol x in this case), without looking up a value. For instance, we might want to put a human-readable tag on a data structure. Or, at a much greater level of sophistication, we might want to treat Scheme code itself as if it were data. Scheme's quote special form enables us to do this, opening up a large number of powerful programming strategies that would not be possible without it.
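As a small illustration of the difference between a name and its value, the sketch below assumes that x has been bound to 42 purely for the example. It shows the quote special form, its ' shorthand, and a quoted piece of Scheme code treated as an ordinary list.

```scheme
;; Assumption for this example only: x is bound to 42.
(define x 42)

x                    ; name lookup: evaluates to 42
(quote x)            ; no lookup: evaluates to the symbol x
'x                   ; shorthand for (quote x)

(symbol? 'x)         ; #t, a quoted name is a symbol
(symbol? x)          ; #f, the value of x is a number
(eq? 'x (quote x))   ; #t, both forms give the same symbol

;; Quoting a whole expression yields a list instead of evaluating it.
(define expr '(+ 1 2))
(car expr)           ; the symbol +, not the addition procedure
(cdr expr)           ; the list (1 2)
```

Exactly how a symbol is printed (x or 'x) depends on the interpreter, but the values themselves are the same.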

One thing that quote allows us to do is to create numerical data that is tagged with its units. In science, you learn to be careful with your units. You don't add feet to meters or force to momentum. The quantities you deal with have dimensions, and you must carry these through your equations to avoid mistakes. Even though computers and calculators provide raw, unitless numbers as their native data types, that doesn't stop you from defining richer data types that capture units or dimensionality. As in your physical calculations, this can help catch errors that might otherwise slip by. And since the computer automates information processing, obvious conversions (e.g., converting feet to meters) can be automated too, so that the computer takes care of the bookkeeping for us.
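One possible way to represent a unit-tagged number is as a two-element list holding a quoted unit symbol and a value. The names make-quantity, quantity-unit, quantity-value, to-meters, and add-lengths are hypothetical, chosen only for this sketch, and only feet and meters are handled. The error procedure is assumed to be available, as it is in most Scheme implementations.

```scheme
;; Hypothetical representation: a quantity is a list (unit value).
(define (make-quantity unit value) (list unit value))
(define (quantity-unit q) (car q))
(define (quantity-value q) (cadr q))

;; Convert a length to meters (1 foot is 0.3048 meters).
;; Any unit other than meters or feet signals an error.
(define (to-meters q)
  (cond ((eq? (quantity-unit q) 'meters) q)
        ((eq? (quantity-unit q) 'feet)
         (make-quantity 'meters (* 0.3048 (quantity-value q))))
        (else (error "to-meters: not a known length unit" (quantity-unit q)))))

;; Add two lengths, converting both to meters first so that
;; feet are never silently added to meters.
(define (add-lengths a b)
  (make-quantity 'meters
                 (+ (quantity-value (to-meters a))
                    (quantity-value (to-meters b)))))

;; 10 feet plus 2 meters, with the result tagged in meters.
(define total
  (add-lengths (make-quantity 'feet 10) (make-quantity 'meters 2)))
```

Here total is tagged with the symbol meters and holds a value of about 5.048. Passing a quantity tagged with, say, seconds to add-lengths would signal an error rather than quietly produce a meaningless number.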

In general, tagged data (data that includes a symbolic tag identifying its type) allows us to handle different kinds of data appropriately. We can be deliberate about identifying what a data item is, and we can write our programs to handle each kind of data they may encounter in the right way, including catching errors when we mistakenly use a data item in the wrong way. This gives us great power in dealing with rich sets of data types and representations, and it gives us a strong defense against data misinterpretation.
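The general pattern can be sketched as a procedure that inspects the tag and chooses what to do. The shapes example and the names attach-tag, type-tag, contents, make-circle, make-square, and area are assumptions made for illustration, not definitions from these notes. The value 3.14159 is an approximation of pi, and error is again assumed to be provided by the implementation.

```scheme
;; Hypothetical tagging helpers: a tagged datum is a pair (tag . value).
(define (attach-tag tag value) (cons tag value))
(define (type-tag datum) (car datum))
(define (contents datum) (cdr datum))

;; Two kinds of data that share the same underlying representation, a number.
(define (make-circle radius) (attach-tag 'circle radius))
(define (make-square side) (attach-tag 'square side))

;; Dispatch on the tag: each kind of datum gets its own formula,
;; and an unrecognized tag is reported instead of being misinterpreted.
(define (area shape)
  (cond ((eq? (type-tag shape) 'circle)
         (* 3.14159 (contents shape) (contents shape)))
        ((eq? (type-tag shape) 'square)
         (* (contents shape) (contents shape)))
        (else (error "area: unknown type tag" (type-tag shape)))))

(area (make-square 3))   ; 9
(area (make-circle 2))   ; about 12.566
```

Without the tags, the circle and the square would both be bare numbers, and area would have no way to tell which formula applies. With them, a datum carrying some other tag reaches the else clause and is rejected.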