Lab 8B: carlae Part II

The questions below are due on Monday November 20, 2017; 10:00:00 PM.


You are not logged in.

If you are a current student, please Log In for full access to the web site.
Note that this link will take you to an external site (https://oidc.mit.edu) to authenticate, and then you will be redirected back to this page.

Table of Contents

1) Timing

This lab is part 2 of a two-week lab, which will be completed next week. Due dates for this part are:

  • Code submission and concept questions: Monday, 20 November, 10pm
  • Checkoff: Wednesday, 29 November, 5pm

2) Preparation

This lab assumes you have Python 3.5 or later installed on your machine.

The following file contains code and other resources as a starting point for this lab: lab8B.zip

You should start by copying your lab.py file from lab 8A into this week's distribution. Most of your changes will be made to this file.

You may submit portions of the lab late (see the grading page for more details), but the last day to submit this lab will be the Friday after the due date.

This lab is worth a total of 4 points. Your score for the lab is based on:

  • passing the test cases from test.py under the time limit (1.5 points), and
  • a brief "checkoff" conversation with a staff member to discuss your code (2.5 points).

Please also review the collaboration policy before continuing.

3) Introduction

In lab 8A, you implemented an interpreter for a dialect of LISP called carlae. This lab builds on your work from lab 8A to introduce some new features into the carlae language. If you have not yet finished lab 8A, you should do so before working on this lab.

Your code should pass the Test1_OldTests and Test2_NewTestsForOldBehaviors test suites with no modification to your lab 8A file.

3.1) Conditionals

Now we'll add support for conditional execution via the if special form, which has the following form: (if COND TRUEEXP FALSEEXP)

To evaluate this form, we need to first evaluate COND. If COND evaluates to true, the result of this expression is the result of evaluating TRUEEXP; if COND instead evaluates to false, the result of this expression is the result of evaluating FALSEEXP. Note that we should never need to evaluate both TRUEEXP and FALSEEXP when evaluating an if expression (for this reason, we cannot implement if as a function; it must be a special form).

3.2) Booleans and Comparisons

In order to implement if, we will need a way to represent Boolean values in carlae. This decision is up to you, but no matter your choice of representation, you should make these values available inside of carlae as literals #t and #f, respectively. We will also need several additional functions, all of which should take arbitrarily many arguments:

  • =? should evaluate to true if all of its arguments are equal to each other.
  • > should evaluate to true if its arguments are in decreasing order.
  • >= should evaluate to true if its arguments are in nonincreasing order.
  • < should evaluate to true if its arguments are in increasing order.
  • <= should evaluate to true if its arguments are in nondecreasing order.

As well as the following Boolean combinators:

  • and should be a special form that takes arbitrarily many arguments and evaluates to true if all of its arguments are true. It should only evaluate the arguments it needs to evaluate to determine the result of the expression. For example, (and (> 3 2) (< 7 8) #f) should evaluate to false.
  • or should be a special form that takes arbitrarily many arguments and evaluates to true if any of its arguments is true. It should only evaluate the arguments it needs to evaluate to determine the result of the expression. For example, (or (> 3 2) #t (< 4 3)) should evaluate to true.
  • not should be a function that takes a single argument and should evaluate to false if its argument is true, and true if its argument is false. For example, (not (=? 2 3)) should evaluate to true.

In carlae (as in Python), and and or do not behave like functions, in the sense that they do not necessarily evaluate all their arguments; they only evaluate as far as they need to to figure out what their result should be.

Check Yourself:

and evaluates to true only if all of its arguments are true. So if we're evaluating all its arguments in order one-by-one, under what condition can we stop (and avoid evaluating the rest of the arguments)? What about or?

After implementing these functions / special forms, modify your evaluate function so that it properly handles the if special form. Once you have done so, your code should pass the Test3_Conditionals test suite in test.py.

With this addition, your interpreter should be able to handle recursion! Try running the following pieces of code from your REPL to check that this is working:

in> (define (factorial n) (if (<= n 1) 1 (* n (factorial (- n 1)))))
in> (factorial 6)

And feel free to play around and try some programs of your own!

3.3) Lists

Next, let's add support for lists. Add a new function called list to the built-in functions. list should take arbitrarily many arguments, and it should return a list containing those elements. In carlae, we will implement lists as linked lists (as is typical in many LISP dialects). The implementation is discussed below:

You should implement your linked list representation as a class called LinkedList. Each instance of LinkedList should have exactly two instance variables:

  • an attribute called elt (the element at this position in the list), and
  • an attribute called next, which points to an instance of LinkedList representing the next element in the list, or None if this instance represents the end of the list.

3.4) Built-in List Functions

Add a list function to carlae; this function should take one or more arguments and should construct a LinkedList instance that contains those arguments, in order. You should make sure that calling list with no arguments produces a representation for an empty list (you may take some inspiration from the fact that the next attribute of a length-1 list is None).

In addition, we will define some additional built-in functions for operating on lists within carlae.

All of the functions below should be implemented by operating directly on instances of LinkedList, without ever converting to a Python list/tuple.

car and cdr

These functions, named for historical reasons, return the "head" and the "tail" of a linked list, respectively.

  • (car LIST) should take a list as argument and should return the first element in that list. If it is called on an empty list, it should raise an EvaluationError.
  • (cdr LIST) should take a list as argument and return the list containing all but the first element in that list. If it is called on an empty list, it should raise an EvaluationError.

Convenience Methods

In addition, we will add a few functions for convenience:

  • (length LIST) should take a list as argument and should return the length of that list.
  • (elt-at-index LIST INDEX) should take a list and an index, and it should return the element at the given index in the given list (as in Python, indices start from 0).
  • (concat LIST1 LIST2 LIST3 ...) should take an arbitrary number of lists as arguments and should return a new list representing the concatenation of these lists. If exactly one list is passed in, it should return a copy of that list. If concat is called with no arguments, it should produce an empty list.

map, filter, and reduce

Beyond these functions, the following will allow us to easily construct new lists from existing ones:

  • (map FUNCTION LIST) takes a function and a list as arguments, and it returns a new list containing the results of applying the given function to each element of the given list.

    For example, (map (lambda (x) (* 2 x)) (list 1 2 3)) should produce the list (2 4 6).

  • (filter FUNCTION LIST) takes a function and a list as arguments, and it returns a new list containing only the elements of the given list for which the given function returns true.

    For example, (filter (lambda (x) (> x 0)) (list -1 2 -3 4)) should produce the list (2 4).

  • (reduce FUNCTION LIST INITVAL) takes a function, a list, and an initial value as inputs. It produces its output by successively applying the given function to the elements in the list, maintaining an intermediate result along the way. This is perhaps the most difficult of the three functions to understand, but it is perhaps easiest to see by example.

    Consider (reduce * (list 9 8 7) 1). The function in question is *. Our initial value is 1. We take this value and combine it with the first element in the list using the given function, giving us (* 1 9) or 9. Then we take this result and combine it with the next element in the list using the given function, giving us (* 9 8) or 72. Then we take this result and combine it with the next element in the list using the given function, giving us (* 72 7) or 504. Since we have reached the end of the list, this is our final return value (if there were more elements in the list, we would keep combining our "result so far" with the next element in the list, using the given function).

The Wikipedia pages for map, filter, and reduce provide some additional examples.

Once we have these three functions, we have the equivalent of list comprehensions in carlae! In Python, for example, we might write:

sum([i**2 for i in some_list if i < 0)

In carlae, we can now do the same thing with the following code:

(reduce + (map (lambda (i) (* i i)) (filter (lambda (i) (< i 0)) some_list)) 0)

This is a lot to take in, but it gives us the same result:

  • It first filters some_list to produce a list containing only the negative values.
  • It then maps the "square" function onto the resulting list.
  • Finally, it reduces that result by successive application of the + operator to produce the sum.

Implementation

Implement the list, car, cdr, length, elt-at-index, concat, map, filter, and reduce functions and add them to the built-in functions. Once you have done so, your code should pass the Test4_Lists suite in test.py. REMINDER that these functions should not operate by converting to Python lists/tuples.

4) Reading From Files

So far, when we have wanted to define new functions, we have had to type them (or copy/paste them) into the REPL every time we run the program. In this section, we will add the capability to run the contents of a file before dropping into our REPL, which we can use, for example, to define multiple functions.

Define a function called evaluate_file in lab.py. This function should take a single argument (a string containing the name of a file to be evaluated) and an optional argument (the environment in which to evaluate the expression), and it should return the result of evaluating the expression contained in the file.

You may find the documentation for Python's built-in open function helpful.

5) Command Line Arguments

Now that we have the ability to evaluate the contents of a file in a particular environment, we will need to let Python know which files it should evaluate before dropping into the REPL. We will do this by passing the names of these files to carlae as command line arguments. For example, instead of just running:

$ python3 lab.py

we will run something like:

$ python3 lab.py some_definitions.crl more_definitions.crl

From inside of Python, these arguments are available as part of the sys.argv list. For the example above, sys.argv will contain:

['lab.py', 'some_definitions.crl', 'more_definitions.crl']

Modify lab.py so that, when lab.py is run with filenames passed in on the command line, it evaluates the expressions contained in those files into the global environment before entering the REPL. You may assume that each file contains only one expression. To test your code, you can make a few files that contain simple expressions you can check (for example, define a single variable inside a file and make sure it is available to you from the REPL after that file is evaluated).

6) Evaluating Multiple Expressions

To help with the above, introduce a new built-in function called begin. begin should simply return its last argument. This is a useful structure for running commands successively: even though only the last argument is returned, all of the arguments are evaluated in turn, which allows us to run arbitrary expressions sequentially.

For example, (begin (define x 7) (define y 8) (- x y)) should evaluate to -1.

After you have implemented begin, you should be able to run python3 lab.py test_files/definitions.crl to grab some definitions into the REPL.

After implementing begin, your code should pass tests the Test6_Files test suite in test.py.

7) Variable Binding Manipulation

We will finish by implementing two additional special forms, which can be used to manipulate variable bindings: let and set!.

7.1) let

let is used for creating local variable definitions. It takes the form: (let ((VAR1 VAL1) (VAR2 VAL2) (VAR3 VAL3) ...) BODY), where VAR1, VAR2, etc., are variable names, and VAL1, VAL2, etc., are expressions denoting the values to which those names should be bound. It works by:

  • Evaluating all the given values in the current environment
  • Creating a new environment whose parent is the current environment, and binding each name to its associated value in this new environment.
  • Evaluating the BODY expression in this new environment (this value is the result of evaluating the let special form).

Note that the given bindings are only available in the body of the let expression. For example:

in> (define z 5)
    out> 5

in> (let ((x 5) (y 3)) (+ x y z))
    out> 13

in> x
    EXCEPTION!

in> y
    EXCEPTION!

in> z
    out> 5

7.2) set!

set! (often pronounced "set-bang", or just "set") is used for changing the value of an existing variable. It takes the form: (set! VAR EXPR), where VAR is a variable name, and EXPR is an expression.

It should work by:

  • Evaluating the given expression in the current environment
  • Finding the nearest enclosing environment in which VAR is defined (starting from the current environment and working upward until it finds a binding), and updating its binding in that environment to be the result of evaluating EXPR

It should also evaluate to that same value.

If VAR is not defined in any environments in the chain, set! should raise an EvaluationError.

in> (define x 7)
    out> 7

in> (define (foo z) (set! x (+ z 2)))
    out> function object

in> (foo 3)
    out> 5

in> x
    out> 5

in> (define (bar z) (define x (+ z 2)))
    out> function object

in> (bar 7)
    out> 9

in> x
    out> 5

Implement let and set! in lab.py. After doing so, your code should pass all the tests in test.py! Note that the last 4 test cases are realistic programs implemented in carlae!

8) Notes and Optional Improvements

Congratulations; you've just implemented your first interpreter! By now your carlae interpreter is capable of evaluating arbitrarily complicated programs!

Hopefully this has been fun, interesting, and educational in its own right, but there are a few important reasons why we've chosen this as a project:

  1. There is something powerful in understanding that an interpreter for a programming language (even one as complicated as Python) is just another computer program, and it is something that, with time and effort, you are capable of writing.

  2. This is an example of a rather large and complicated program, but we were able to manage that complexity by breaking things down into small pieces.

  3. We wanted to give you an opportunity to make some more open-ended design decisions than you have been asked to make in the past, and this lab offers such an opportunity.

  4. Our little carlae interpreter actually has a lot in common with the Python interpreter, and so there is a hope that you have learned something not only about this little language, but also about how Python behaves. Among other things:

    • both run programs by breaking the process down into lexing, parsing, and evaluating, and
    • the way function calls are scoped and handled is very similar in the two languages.
  5. Course 6 and LISP have a long history:

    • LISP was conceived by John McCarthy at MIT in 1958, and one of the most widely used LISP dialects, Scheme, was developed here by Guy Steele and Gerry Sussman in 1970.
    • The predecessor to 6.009, 6.001 Structure and Interpretation of Computer Programs, was taught as part of the course 6 introductory series for around 30 years and used Scheme as its primary language. The associated text is still considered by many to be one of the best books ever written about computer programming.

Now that we have a working interpreter, section 11 includes a few suggestions of neat (but optional) additional features that you might consider adding to your interpreter. Although these are not required, they would make great extra practice if you're looking for it!

9) Code Submission

 No file selected

10) Checkoff

Once you are finished with the code, please come to a tutorial, lab session, or office hour and add yourself to the queue asking for a checkoff. You must be ready to discuss your code and test cases in detail before asking for a checkoff.

You should be prepared to demonstrate your code (which should be well-commented, should avoid repetition, and should make good use of helper functions). In particular, be prepared to discuss:

  • any changes you made to your code from lab 8A (if any)
  • your internal representation of #t and #f
  • how you implemented short circuiting in and and or
  • your implementation of LinkedList and the associated functions
  • how you represented empty lists
  • your implementation of let and set!
  • demonstrate your REPL working after loading a file

10.1) Grade

You have not yet received this checkoff. When you have completed this checkoff, you will see a grade here.

11) Optional Improvements

11.1) Tail-Call Optimization

If you have the time and interest, a really interesting and powerful optimization for our interpreter comes in the form of tail-call optimization.

A typical way of structuring the evaluator from above involves making recursive calls to evaluate, in particular when calling functions. This is a nice way of structuring things, but it actually leads to issues. For example, try calling (fib 2000) from your REPL after loading in the definitions from resources/definitions.crl. What happens?

We run into issues with recursive calls because Python (necessarily) has a limit on the "depth" it will allow in a recursive call, to prevent infinite recursion. Even if Python didn't have this limit, it would end up using quite a lot of memory allocating new stack frames for these recursive calls.

A neat optimization to avoid this problem is to implement tail-call optimization, whereby we can avoid some of these issues (allowing, for example, computing (fib n) for arbitrarily large n).

In short, many pieces of our interpreter involve returning the result of a single new call to evaluate, with a different expression and a different environment. In those situations (conditionals, calling user-defined functions, etc.), it would be much better from an efficiency perspective to avoid the recursive call to evaluate; rather, we can simply adjust the values of the expression and the environment within the same call to evaluate by introducing a looping structure: keep looping until we have successfully evaluated a structure, and if the result is simply the result of evaluating another expression (potentially in a different environment), then adjust those values as necessary.

There are no tests for this optimization, but after doing so, your code should work for (fib 100000) (or arbitrarily high n)!

11.2) Additional (Optional) Improvements and Exercises

If you have the time and interest, you can improve upon this base language by implementing some additional features. Below are some ideas for possible improvements, or just for ways to get extra practice. These are by no means required, but we are still happy to help with them if you get stuck!

  • Add a function called set-car! that changes the first element of a linked list to a particular value. Then use your list built-ins to implement a set-elt-at-index function in carlae.
  • Add support for the quote and unquote special forms (search for "special form: quote" on this page).
  • Implement strings as an additional data type (be careful of how you handle comments to make sure that a ; within a string doesn't get treated as the start of a comment!).
  • Allow the body of a function defined with lambda, or the body of a let expression, to consist of more than one expression (this should be implicitly translated to a begin expression).
  • Improve the system's error reporting by providing meaningful error messages that describe what error occurred.
  • Since iteration doesn't exist in carlae (except via recursion), try implementing some simple programs in carlae for extra practice with recursion!
  • Add support for importing functions and constants from Python modules (by mapping carlae functions to the the __import__ and getattr Python functions) so that the following will work:
in> (define math (py-import (quote math)))
in> (define sqrt (getattr math (quote sqrt)))
in> (sqrt 2)
    out> 1.4142135623730951