|
You must unlearn what you have learned. -- Yoda |
Before we begin, it's important that you realize that we are not just teaching you a new language and a new way of thinking to torture you. Programming languages change and evolve at an astonishing rate; many of the most popular languages around now didn't even exist a few years ago. Programming paradigms change more slowly, but there are already many of these (imperative, functional, object-oriented, logic programming, constraint-based programming, stream-based programming) with more on the way (quantum computation, DNA computation, etc.). If all we taught you was how to program in one language we would be doing you a disservice. What is needed is for us to train you to think in a different and more flexible way so that learning new languages and new paradigms isn't difficult any more. That's what this course is about. It will probably be painful at first, but if you stick with it it will be rewarding.
One thing we will not to do on this page is to indulge in pointless discussion over whether Scheme or C is a "better" language; both languages have their place. Scheme does have major advantages as a language for teaching the fundamental ideas of programming, which is why we use it in CS 1. A lot of the programs that you have to write for CS 1 will either not be (directly) writable in C or else would be much more difficult to write in C.
Another thing we won't do on this page is to try and explain the underlying semantics of Scheme; we'll do that in class. Instead, this is a practical summary of the differences between C and Scheme. You should always be aware, though, that when we say "feature X in C is equivalent to feature Y in Scheme", this doesn't mean that they are semantically equivalent; it just means that they are used to achieve the same end.
Finally, if you don't understand something on this page, particularly regarding Scheme, don't panic; we've tried to be fairly comprehensive here and many of the aspects of Scheme we discuss below aren't covered in class until after the midterm. As always, you should ask your TA, your recitation instructor, or the course instructor if you have any questions.
Also, see the
for more information.
|
Syntactic sugar causes cancer of the semicolon. -- Alan Perlis |
| atoms | lists |
|---|---|
| 1 | (x y z) |
| 2.3 | (1 2 3 4 5) |
| x | ((1 2) (3 4) 5) |
Usually the only other syntactic form you will see in Scheme are comments, which start with a semicolon ";" and go to the end of the line:
| C | Scheme |
|---|---|
| /* This is a comment in C. */ | ; This is a comment in Scheme. |
The semicolon has no other function in Scheme. In particular, it does not serve as an end-of-statement marker. There is no multi-line comment operator in Scheme either; each comment line must have its own semicolon. It is also quite typical in Scheme to have a comment that starts with more than one semicolon; this is just to make the comment look nicer and has no other meaning:
| Scheme |
|---|
| ; This is a comment in Scheme. |
| ;; This is another comment in Scheme. |
! $ % & * + - . / : < = > ? @ ^ ~The most common of these characters to be used in identifiers are "?", "!", and "-". "?" is typically used in the names of functions that return a boolean (true/false) value (e.g. "even?"), "!" is typically used in the names of functions or special forms that change the values of variables (e.g. "set!") and hyphens are typically used in function names where C would use underscores (e.g. "calculate-result"). These are just conventions and are not enforced by the language. Also, note that the usual arithmetic operators (+, -, *, /) are just ordinary identifiers in Scheme. We'll talk more about them below.
Also, in Scheme, identifiers are not case-sensitive, so "foo" and "FOO" mean the same thing when used as e.g. a variable or function name. In C, identifiers are case-sensitive, so "foo" and "FOO" would represent two completely different names.
| C | Scheme |
|---|---|
| a + b | (+ a b) |
| (a - b) * 2 | (* (- a b) 2) |
This is quite unpleasant for most people at first, but it does have several advantages:
| C | Scheme |
|---|---|
| a * b - c | (- (* a b) c) |
| a * (b - c) | (* a (- b c)) |
In the C case, does "a * b - c" mean "(a * b) - c" or "a * (b - c)"? In fact, it means the former, but we need to know that the precedence of "*" is higher than that of "-" to figure this out. In the case of Scheme, the two possibilities are syntactically different so there is never any ambiguity. The price we pay for this is that parentheses are never optional in Scheme; every parenthesis has to be there, and additional parentheses change the meaning of an expression. For instance, "((foo x y))" is NOT the same as "(foo x y)"); the former expression means to take the result of evaluating the expression "(foo x y)" (which should be a function) and evaluate it again, while the latter expression means to just evaluate "(foo x y)". If this seems confusing, take our word for it; it'll become clearer as we progress in the course.
| C | Scheme |
|---|---|
| a + b + c + d + e | (+ a b c d e) |
This is often convenient.
One consequence of prefix syntax is that successive atoms MUST be separated by whitespace (space character(s), tab(s), or newline(s)), or else it would be impossible to distinguish the separate atoms:
| C | Scheme |
|---|---|
| a+b+c+d+e /* OK */ | (+ abcde) ; no good |
In the example above, there is no way to know that "abcde" refers to five different identifiers instead of one identifier called "abcde". You have to put the spaces between them (actually, this is good style in C as well). Similarly, in Scheme "a-b" refers to a single identifier and not to "a - b", because the minus sign can be included in identifiers.
Like C, Scheme is not too picky about true values; any value that is not a false value is considered to be true if it's used as a boolean.
| C | Scheme |
|---|---|
| 'a' 'b' 'c' '\n' ' ' | #\a #\b #\c #\newline #\space |
As you can see, some characters have mnemonic names e.g. #\newline.
| C | Scheme |
|---|---|
int x = 0; x = 1; x = "foo"; /* compiler complains */ |
(define x 0) ; no type specified (set! x 1) (set! x "foo") ; OK |
Note that many type errors in C don't result in errors but instead in compiler warnings (as in the above example). Thus, C's type checking is quite loose. We might say that C is only weakly strongly-typed ;-)
| C | Scheme |
|---|---|
#include <math.h> sqrt(10.0); |
(sqrt 10.0) |
These occur at the top level of the program (outside of function bodies).
| C | Scheme |
|---|---|
int i = 10; |
(define i 10) |
| C | Scheme |
|---|---|
{
int i = 10;
int j = 20;
/* more code */
}
|
(let ((i 10)
(j 20))
; more code
)
|
{
int i = 10;
int j = i;
/* more code */
}
|
(let* ((i 10)
(j i))
; more code
)
|
Note that if you have multiple local variable definitions where subsequent definitions depend on previous ones, you have to use "let*" instead of "let" in Scheme. In C the syntax is identical in both cases.
| C | Scheme |
|---|---|
i = 20; |
(set! i 20) |
Note that the normal style of Scheme programming (see below) makes most assignment statements unnecessary. Assignment statements are thus much rarer in well-written Scheme code compared to C code. In fact, for the first half of CS 1 you should pretend that assignment statements don't even exist.
Another thing to note is that the return value of an assignment expression in Scheme is unspecified, so you can't do the equivalent of "a = b = c = 0;" in C:
| C | Scheme |
|---|---|
a = b = c = 0; |
(set! a 0) (set! b 0) (set! c 0) |
| C | Scheme |
|---|---|
printf("hello, world\n");
j = 10;
|
(begin (display "hello, world") (newline) (set! j 10)) |
The return value of the "begin" statement is the value of the last statement evaluated. In this case, the last statement is a "set!" (which has an "unspecified return type" which means you shouldn't count on anything being returned), so you shouldn't use the return value for anything. In C, blocks of statements don't have a value, so the problem doesn't come up.
| C | Scheme |
|---|---|
foo(a, b, c) |
(foo a b c) |
| C | Scheme |
|---|---|
int foo(int x)
{
return x * 2;
}
|
(define (foo x) (* x 2)) |
int foo(int x)
{
int y = 10;
return x * y;
}
|
(define (foo x)
(let ((y 10))
(* x y)))
|
int foo(int x)
{
printf("x = %d\n", x);
return x * 2;
}
|
(define (foo x) (display "x = ") (display x) (newline) (* x 2)) |
Note that Scheme doesn't have a "return" statement. The last expression evaluated in the function body is the value of the function.
| C | Scheme |
|---|---|
if (a > 0)
{
printf("Woo hoo!\n");
a = 1;
}
else
{
printf("Aw shucks!\n");
a = -1;
}
|
(if (> a 0)
(begin
(display "Woo hoo!")
(newline)
(set! a 1))
(begin
(display "Aw shucks!")
(newline)
(set! a -1)))
|
(a > 0) ? 1 : -1 |
(if (> a 0) 1 -1) |
if (a > 0)
{
printf("greater than 0\n");
}
else if (a < 0)
{
printf("less than 0\n");
}
else
{
printf("equal to zero\n");
}
|
(cond ((> a 0)
(display "greater than 0")
(newline))
((< a 0)
(display "less than 0")
(newline))
(else
(display "equal to zero")
(newline)))
|
Note that after each of the tests in a "cond" statement there is an implicit "begin" i.e. you can put several statements before the closing parenthesis and they will all be executed sequentially. If you want to do this with an "if" statement you have to put the "begin" in explicitly.
| C | Scheme |
|---|---|
for (i = 0; i < 10; i++)
{
printf("i = %d\n", i);
}
|
(do ((i 0 (+ i 1)))
((= i 10) #f) ; return #f at the end
(display "i = ")
(display i)
(newline))
|
Explicit iteration statements using "do" are extremely rare in well-written Scheme code; instead, the normal style is to use recursion for iteration (see below).
Some operators are different in Scheme and C.
| C | Scheme |
|---|---|
== |
= |
! |
not |
a != b |
(not (= a b)) |
a++; |
(set! a (+ a 1)) |
a += 10; |
(set! a (+ a 10)) |
(and similarly for "--", "-=", "*=" etc.).
Arrays are called "vectors" in Scheme.
| C | Scheme |
|---|---|
int i[] = { 1, 2, 3, 4, 5 };
|
(define i (vector 1 2 3 4 5)) (define i #(1 2 3 4 5)) ; equivalent |
i[2] |
(vector-ref i 2) |
i[2] = 10; |
(vector-set! i 2 10) |
The Scheme syntax for vectors is admittedly quite verbose.
Scheme includes several language features that are not found in C:
Probably the most profoundly different feature of Scheme relative to C is that functions are "first-class" data in Scheme but not in C. What "first-class" means is that in Scheme, functions are data just like numbers or strings. Functions can be passed as arguments to other functions, they can be returned as the result of a function, and they can be created inside a function. Here is an example that shows this:
| C | Scheme |
|---|---|
| No equivalent! |
(define (adder n)
(lambda (x) (+ x n)))
|
What the "adder" function does is to create a new function of one argument that adds "n" to the argument. It's used like this:
(define add2 (adder 2))
(add2 5) ; result: 7
This ability to create functions on-the-fly is very powerful, and we will be
exploring this in detail in CS 1. The mysterious word "lambda" basically
just means "make a function", so
(lambda (x) (+ x n))
means "make a function of one argument called x which adds x to n and returns
the result". It turns out that all function definitions in Scheme are
actually lambda expressions, so that
(define (f x) (* x 2))
is just a syntactic shorthand for
(define f (lambda (x) (* x 2)))
The word "lambda" comes from the lambda calculus, which is an abstract model
of computation which furnishes the theoretical foundations for the Scheme
programming language. But that's another story. You should think of a
lambda expression as a function without a name; the "define" expression gives
it a name (it binds the name to the function). C does not have
anonymous functions.
Functions which accept functions as arguments are known as higher-order functions. They have many uses; here is a trivial one:
(map (lambda (x) (* x 2)) (list 1 2 3 4 5)) ; result: (2 4 6 8 10)
Here, "map" takes a function as its argument and "maps" the function over
each element of a list. The result is a list where each element is twice as
big as the corresponding element of the input list. You'll see many more
examples of higher-order functions and their uses in the course (and in SICP, the course text).
It should be noted that C contains a very primitive version of this
capability: function pointers. Function pointers allow a C programmer to
pass a function as an argument to another function. However, it isn't
possible to define a function inside the body of another function in C (at
least not in standard C; gcc does allow this), so this limits
the usefulness of function pointers considerably.
There are a variety of ways of defining functions inside functions in Scheme. The simplest way is to use an internal "define":
(define (foo x)
(define (bar y) (* y 2))
(* (bar x) 3))
This is a totally trivial example, but nested function definitions can be
quite useful, as you'll see in the course.
As mentioned above, lists (more specifically singly-linked lists) are a fundamental data type in Scheme but not in C.
Scheme has a special data type for symbols and symbolic expressions. They are created using the "quote" special form:
(define a (quote b))
(define a 'b) ; the same thing
The single-quote character is syntactic sugar for the "quote" expression, so
"'b" is the same as "(quote b)". But what is "(quote b)", anyway?
It's just the symbol "b", unevaluated. Basically, what "quote" does is to
switch off the Scheme evaluator. If there was no "quote", then Scheme would
try to evaluate "b" in the current environment. We call "(quote b)" a symbol
(in this case, the symbol "b"). You can also quote whole expressions e.g.
(quote (1 2 3 4 5))
'(1 2 3 4 5) ; the same thing
(list 1 2 3 4 5) ; equivalent
In this case, "'(1 2 3 4 5)" is just the literal list "(1 2 3 4 5)", which is
also the result of evaluating the expression "(list 1 2 3 4 5)". This is a
convenient shorthand for creating literal lists. There are lots of other
uses for quoted expressions, but they're beyond the scope of the course.
Continuations are pretty advanced, and you don't have to know about them for CS 1. On the positive side, continuations can be used to implement arbitrarily complex control structures. On the negative side, too much programming with continuations can make your head explode ;-) Scheme supports continuations as first-class objects. If you're curious, see an advanced Scheme textbook or do an internet search.
Scheme does not permit direct manipulation of pointers (variables that hold memory addresses) or pointer arithmetic the way that C does. However, in some sense every value in Scheme is a pointer to an object, so this doesn't cause problems in practice (unless you need direct access to the machine's memory for some reason, in which case Scheme is not the right language for the job). Pointers are a notorious source of bugs in C; not having them in Scheme makes a programmer's life much more pleasant.
Standard Scheme does not have a struct-like construct. However, virtually all Scheme dialects do have such a construct (for instance, DrScheme does). It is quite easy to roll your own struct-like construct in Scheme with a little effort.
You can achieve the same effect as a switch statement in C using a cond statement:
| C | Scheme |
|---|---|
switch (a)
{
case 0:
printf("zero\n");
break;
case 1:
printf("one\n");
break;
default:
printf("whatever\n");
break;
}
|
(cond ((= a 0)
(display "zero")
(newline))
((= a 1)
(display "one")
(newline))
(else
(display "whatever")
(newline)))
|
There are no bitwise operators in standard Scheme. C is a better language than Scheme for low-level bit manipulation.
There are no unsigned integer types in Scheme.
Memory allocated in a Scheme program does not have to be explicitly freed, because the Scheme run-time system does that for you. This is called garbage collection, and it makes programming much less of a hassle than in C, where you have to explicitly free all memory that you allocate, or else you are likely to run out of memory.
In C, if you try to access the 100th element of a 10-element array, anything can happen. You might get a core dump (i.e. your program/OS may crash), or your program may continue and then mysteriously fail at a later time. In Scheme, an error is reported immediately.
| C | Scheme |
|---|---|
void bad()
{
char s[] = "foo";
int result;
result = 1 + s; /* generates a compiler warning */
}
|
;; This is perfectly legal Scheme: (define (bad) (+ 1 "foo")) ;; We get an error when we call the function: (bad) |
Note, however, that even flagrant type errors like trying to add a string to an integer only result in compiler warnings, not outright errors. The type checking in C is actually quite loose (the same code would give a compiler error in C++), so the type checking in C doesn't help as much as you would think.
|
You have taken your first step into a larger world. -- Obi-wan Kenobe |
Scheme encourages a very different style of programming than C does, and this is a major obstacle for beginning Scheme programmers who are accomplished C programmers. Suffice it to say that if you think that all programming languages are basically the same, you are in for a very unpleasant shock. The style of programming that Scheme encourages is usually known as the functional programming style. This has a number of features:
Having said that, it's important to realize that it is possible to write code in Scheme in a completely imperative style as well. One of the big advantages of Scheme is that you can use multiple different programming styles (whichever one suits the problem best). In contrast, it is very difficult to program in C in anything other than a purely imperative style.
gcc supports a large number of other
extensions as well), but basically, C is the same language everywhere. The
situation is different with Scheme. There is a fundamental core of
functionality that a Scheme implementation has to support in order for it to
call itself Scheme; this is the Scheme standard. The current standard is
called "R5RS"; it's the fifth revision of the Scheme standard. However,
there are a lot of fundamental features that are not covered by the standard.
They include things like Scheme's equivalent to C's structs, module systems,
exception handling systems, and object-oriented extensions to Scheme.
Because Scheme is so flexible, many such extensions exist. Unfortunately,
almost all of them are incompatible with one another. This is not a problem
for our purposes because we will only be needing the core R5RS Scheme for the
course (and not even all of that). However, you should be aware that each
particular implementation of Scheme is a dialect with a common core.