CS 334: Principles of Programming Languages

CS 334: Lab 7: Scala Basics

Overview
Partner
Getting Started
Programming
Submitting Your Work

Overview

In this lab, you will get hands-on experience with Scala programming. You will work with file I/O, collections, classes, pattern matching, and algebraic data types through three progressively more involved problems.

Partner

You are encouraged to work with a partner on this lab. As always, please send email if you would like help finding a partner.

Getting Started

Setting Up Your Repository

You will receive an email with an invitation link to the lab7 assignment on GitHub Classroom. You can follow the same instructions as in Lab 2 to access and clone your repository. See the GitHub reference for instructions to add a partner. Put your answers to the problems below in the appropriate files in your repository.

Scala Setup

This lab targets Scala 2.13. See instructions here for setting up Scala.

The scala command will give you a "read-eval-print" loop, as in Lisp and ML.

You can also compile and run a whole file as follows. Suppose file A.scala contains:

object A {
    def main(args : Array[String]) : Unit = {
        println(args(0));
    }
}

You then compile the program with scalac A.scala, and run it (and provide command-line arguments) with "scala A moo cow".

Resources

Use Scala 2.13-specific references for this lab; the latest Scala 3 docs have diverged in important ways.

Programming

1. The Happy Herd (15 points)

In this first question we'll use Scala to answer a few questions about cows. Specifically, the herd at Cricket Creek Farm...

First, write a program to read in and print out the data in the file "cows.txt" in your repository. Each line contains the id, name, and daily milk production of a cow from the herd. (I've also included a "cows-short.txt" file that may be useful while debugging.)

Your repository contains a skeleton "Cows.scala" with an object definition and a main method. Recall that objects are like classes, except that only a single instance is created. The skeleton looks something like:
```
object Cows {
    def main(args : Array[String]) : Unit = {
        // ...
    }
}
```
One useful snippet of code is the following line.
```
val lines = scala.io.Source.fromFile("cows.txt").getLines();
```
We will use this to read the file. Try this out in the Scala interpreter. What type does lines have? For convenience in subsequent processing, it will be useful to convert lines into a list:
```
val data = lines.toList;
```
Print out the list and verify you are successfully reading all the data. Use a for loop. For loops in Scala follow a familiar syntax:
```
scala> for (i <- 1 to 3) println(i);
1
2
3
```

S-cows. A for-comprehension lets you iterate over a collection, optionally with one or more if filters. For example:

scala> for (i <- 1 to 3) println(i)
1
2
3

scala> for (i <- 1 to 5 if i%2 == 0) println(i)
2
4

Multiple if clauses chain together — the element is kept only if all the conditions hold:

scala> for (i <- 1 to 10 if i%2 == 0 if i%3 == 0) println(i)
6

Using a for-comprehension with chained if filters, print all cows whose name contains "s" (case-insensitive) but not "h". Scala Strings support the usual Java String operations; a few useful ones here and below:

class String {
    def contains(str : String) : Boolean
    def startsWith(str : String ) : Boolean
    def toLowerCase() : String
    def toUpperCase() : String

    // split breaks up a line into pieces separated by separator.
    //   For ex:   "A,B,C".split(",")  ->  ["A", "B", "C"]
    def split(separator : String) : Array[String] 
}
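
As a rough illustration of chained if filters combined with the String methods above, here is a minimal sketch. The object name FilterSketch, the helper nameOf, and the sample rows are all made up for this example and are not part of the starter code.

```scala
object FilterSketch {
  // Hypothetical sample rows in an "id,name,value" shape; not the lab's data.
  val rows: List[String] =
    List("1,Apple,3.5", "2,Banana,1.0", "3,Cherry,2.25", "4,Peach,4.0")

  // Pull out the second comma-separated field.
  def nameOf(line: String): String = line.split(",")(1)

  def main(args: Array[String]): Unit = {
    // Each `if` is a separate filter; a row is printed only if both hold.
    for (line <- rows
         if nameOf(line).toLowerCase.contains("a")
         if !nameOf(line).toLowerCase.contains("n"))
      println(line)
  }
}
```

With this sample data, running main prints the Apple and Peach rows: Banana is rejected by the second filter and Cherry by the first.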

SHOUTING cows. Scala supports list comprehensions with yield, which build a new list rather than just iterating for side effects:
```
val doubled = for (x <- list) yield 2 * x;
val evens   = for (x <- list if x%2 == 0) yield x;
```
Use a for ... yield expression to build a new list containing each cow's name in UPPER CASE. Store the resulting list in a variable, then print it.
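
To make the difference between a plain for loop and for ... yield concrete, here is a small sketch on made-up data. The names YieldSketch, words, lengths, and longOnes are hypothetical.

```scala
object YieldSketch {
  // Hypothetical sample data, not the lab's data file.
  val words: List[String] = List("moo", "oink", "baa")

  // yield collects one result per element into a new List; `words` is unchanged.
  val lengths: List[Int] = for (w <- words) yield w.length

  // A filter and a transformation can be combined in one expression.
  val longOnes: List[String] = for (w <- words if w.length > 3) yield w + "!"
}
```

Here lengths is List(3, 4, 3) and longOnes is List("oink!"). Because the result is an ordinary List, it can be stored in a val and printed or processed later.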

Cow Objects. Next, define a new class in "Cows.scala" to store one cow: its id, its name, and its daily milk production.
```
class Cow(s : String) {
    def id : Int = ... 
    def name : String = ... 
    def milk : Double = ...
    override def toString() = { 
        ...
    }
}
```
It takes in a string of the form "id,name,milk" from the data file and provides the three methods shown. For toString, you may find formatting code like "%5d ".format(n) handy -- it formats the integer n as a string and pads it on the left to 5 characters.
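
The general pattern of a class that parses its constructor string can be sketched as follows. Tally and its "label,count" input format are invented for illustration; your Cow class has different fields.

```scala
// A hypothetical record type parsed from a "label,count" string.
class Tally(s: String) {
  // Split once; each accessor converts one field.
  private val fields: Array[String] = s.split(",")

  def label: String = fields(0)
  def count: Int = fields(1).trim.toInt

  // "%5d" right-aligns the number in a field 5 characters wide.
  override def toString: String = "%5d %s".format(count, label)
}
```

For example, new Tally("eggs,12").toString gives "   12 eggs" (three spaces of padding before the 12).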

Use a map operation on data to convert it from a list of strings to a list of Cows (e.g., val herd = data.map(...)). Print the result and make sure it works.

Milk-o-meter. Use a for loop over your herd to print an ASCII bar chart, one line per cow, where the bar length is proportional to daily milk production. The bar itself is just "*" * (cow.milk * 2).toInt (in Scala, "x" * n repeats a string n times).

To make the bars line up, use Scala's printf-style format method on strings, e.g. "%s is %d".format(name, n). See the Java Formatter docs for width and alignment specifiers.

Output should look something like:
```
Ava       : *******
Shirley   : *********
Cashew    : ******
Carmen    : ******
Shiloh    : ********
```
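
One way the width specifiers and string repetition fit together is sketched below. The helper bar, the "#" character, and the 8-column label width are arbitrary choices for this example.

```scala
object BarSketch {
  // "%-8s" left-justifies the label in 8 columns;
  // "#" * n repeats the string n times.
  def bar(label: String, n: Int): String =
    "%-8s: %s".format(label, "#" * n)
}
```

For instance, BarSketch.bar("tea", 3) produces "tea     : ###". Dropping the minus sign ("%8s") would right-justify the label instead.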

Sorting Cows. Use the sortWith method on Lists to sort the cows by id. Also use foldLeft to tally up the milk production for the whole herd.

class List[A] {
    def sortWith (lt: (A, A) => Boolean) : List[A]
    def foldLeft [B] (z: B)(f: (B, A) => B) : B
}

Note that foldLeft is a polymorphic method with type parameter B. Also, foldLeft is curried: it takes its initial value and its combining function in two separate argument lists, as in:

val list : List[Int] = ...;
val n : Int = ...;
list.foldLeft (n) ( (x: Int, elem: Int) => ... )
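
Here is a small, self-contained sketch of both methods on made-up (name, score) pairs. FoldSketch and its fields are hypothetical names.

```scala
object FoldSketch {
  // Hypothetical (name, score) pairs.
  val scores: List[(String, Int)] = List(("c", 30), ("a", 10), ("b", 20))

  // sortWith takes a "less than" function on two elements.
  val byScore: List[(String, Int)] = scores.sortWith((p, q) => p._2 < q._2)

  // foldLeft: the first argument list is the starting accumulator; the second
  // is the function combining the accumulator with each element in turn.
  val total: Int = scores.foldLeft(0)((acc, p) => acc + p._2)
}
```

Here byScore is List(("a", 10), ("b", 20), ("c", 30)) and total is 60. The accumulator type B (Int here) need not match the element type A (a pair here).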

Best and Worst Milkers. Finally, use the maxBy and minBy methods on your list of cows to find the cows with the highest and lowest daily milk production.
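
As a quick sketch of maxBy and minBy, the example below uses a hypothetical City case class with invented population figures.

```scala
object ExtremesSketch {
  // Hypothetical record type and made-up numbers.
  case class City(name: String, population: Int)

  val cities: List[City] =
    List(City("Ames", 66000), City("Bend", 99000), City("Cody", 10000))

  // maxBy/minBy take a function that extracts the value to compare on
  // and return the whole element, not just that value.
  val largest: City = cities.maxBy(c => c.population)
  val smallest: City = cities.minBy(_.population)
}
```

Here largest is City("Bend", 99000) and smallest is City("Cody", 10000). Both methods throw an exception on an empty list, so they are only safe once you know the list has at least one element.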

2. Ahoy, World! (15 points)

You'll now learn to speak like a pirate, with the help of Scala maps and a Translator class... The program will take in an English sentence and convert it into pirate. For example, typing in

pardon, where is the pub?

gives you

avast, whar be th' Skull & Scuppers?

Your repository contains a "Pirate.scala" file to start with. You will be responsible for implementing a Translator class, reading in the pirate dictionary, and processing the user input. It will be easiest to proceed in the following steps:

First, complete the Translator class. It has the following signature:
```
class Translator {
    // Add the given translation from an English word
    //    to a pirate word
    def += (english : String, pirate : String) : Unit

    // Look up the given english word.  If it is in the dictionary, 
    //   return the pirate equivalent.  
    // Otherwise, just return the English word unchanged.  
    def apply(english : String) : String

    // Return a printable string listing the dictionary's entries
    override def toString() : String
}
```
Note that we're overloading the += and () operators for Translator. Thus, you use a Translator object as follows:
```
val pirate = new Translator();
pirate += ("hello", "ahoy");
..
val s = pirate("hello");
```
If "hello" is in the dictionary, its pirate-translation is returned. Otherwise, your translator should return the word passed in. Any non-word should also just be returned. Thus:
```
pirate("hello")   ==>  "ahoy"
pirate("moo")     ==>  "moo"
pirate(".")       ==>  "."
```
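
The operator-style calls are ordinary method calls under the hood. The sketch below shows this with a hypothetical Guestbook class, which deliberately stores its data in a list rather than a map so that it does not resemble the Translator you are asked to write.

```scala
// A hypothetical class showing how operator-like method names are called.
class Guestbook {
  private var entries: List[(String, String)] = Nil

  // Called as `book += (name, note)`, which means `book.+=(name, note)`.
  def +=(name: String, note: String): Unit = {
    entries = (name, note) :: entries
  }

  // Called as `book(name)`, which means `book.apply(name)`.
  def apply(name: String): Int = entries.count(_._1 == name)
}
```

After val book = new Guestbook() and book.+=("ann", "hi"), the call book("ann") returns 1 and book("bob") returns 0.
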
When writing apply, use the get method on map and pattern matching to handle the Option type it returns. (See class notes / tutorial for details on Option.)

Finish the definition of Translator using a Scala map instance variable. To write toString, you may find it handy to look at the mkString methods of the Scala Map classes.
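
Matching on the Option returned by get, and joining entries with mkString, can be sketched as follows. The object OptionSketch, the ages map, and both helper functions are hypothetical.

```scala
object OptionSketch {
  // Hypothetical lookup table.
  val ages: Map[String, Int] = Map("ann" -> 31, "bob" -> 27)

  // Map.get returns Option[Int]: Some(value) if the key is present, else None.
  def describe(name: String): String = ages.get(name) match {
    case Some(age) => name + " is " + age
    case None      => name + " is unknown"
  }

  // mkString joins the rendered entries using the given separator.
  def listing(m: Map[String, Int]): String =
    m.map { case (k, v) => k + " -> " + v }.mkString("\n")
}
```

Here describe("ann") returns "ann is 31" and describe("zed") returns "zed is unknown". Note that a Map does not promise any particular order for its entries, so listing may print them in a different order than they were added.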

Add a few lines to the Pirate main method to test your translator.
Now, read in the full pirate dictionary from the "pirate.txt" data file, and print out the resulting translator.

Once you have the translator built, uncomment the lines in main that process standard input, and process the text the user types in. There are a few sample sentences in your repository. Here is an example:

Stephen-Freund:~/scala] cat sentence1.txt 
pardon, where is the pub?
I'm off to the old buried treasure.

Stephen-Freund:~/scala] scala Pirate < sentence1.txt
avast, whar be th' Skull & Scuppers?
I'm off to th' barnacle-covered buried treasure.

Note: You may get some deprecation warnings to the effect of "multiarg infix syntax looks like a tuple and will be deprecated". Typically I'd make sure the code produces no warnings, but let's not worry about it this week...

3. Argh, Expressions Matey (20 points)

Compiling Expressions.scala

This question uses the Scala parser combinator library, which is no longer bundled with Scala 2.13. Your starter repository includes scala-parser-combinators.jar; put it on the classpath when you run your code:

scala -cp .:scala-parser-combinators.jar Expressions.scala

(On Windows, use .;scala-parser-combinators.jar as the classpath separator.)

In Scala, algebraic datatypes can be defined with the use of abstract classes and case classes. Consider the following algebraic data type for expressions:

sealed abstract class Expr 
case class Variable(name: String) extends Expr 
case class Constant(x: Double) extends Expr 
case class Sum(l: Expr, r: Expr) extends Expr 
case class Product(l: Expr, r: Expr) extends Expr 
case class Power(b: Expr, e: Expr) extends Expr

This Scala code is equivalent to the following definition in ML:

datatype Expr = 
    Variable of string 
    | Constant of real
    | Sum of Expr * Expr 
    | Product of Expr * Expr 
    | Power of Expr * Expr;

Have a look at the starter code in Expressions.scala to see an example of Scala-style pattern matching on case classes.
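
If you want a standalone feel for the pattern, here is a minimal sketch of a recursive function over this datatype. The case classes are copied from the definition above; the object NodeCount and its size function are hypothetical and do not appear in the starter code.

```scala
sealed abstract class Expr
case class Variable(name: String) extends Expr
case class Constant(x: Double) extends Expr
case class Sum(l: Expr, r: Expr) extends Expr
case class Product(l: Expr, r: Expr) extends Expr
case class Power(b: Expr, e: Expr) extends Expr

object NodeCount {
  // Count the nodes in an expression tree: one case per constructor,
  // recursing into the sub-expressions bound by each pattern.
  def size(e: Expr): Int = e match {
    case Variable(_)   => 1
    case Constant(_)   => 1
    case Sum(l, r)     => 1 + size(l) + size(r)
    case Product(l, r) => 1 + size(l) + size(r)
    case Power(b, ex)  => 1 + size(b) + size(ex)
  }
}
```

For example, NodeCount.size(Sum(Variable("x"), Product(Constant(2), Variable("y")))) is 5. Because Expr is sealed, the compiler can warn you if a match leaves out one of the constructors.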

Derive. Write a function that takes the derivative of an expression with respect to a given variable. Your function should have the following signature:
```
def derive(e: Expr, s: String): Expr
```
Your function does not have to take the derivative of Powers with non-constant exponents. It is acceptable to throw an exception in that circumstance.

Also, you'll likely need the Chain Rule for the Power case, though you may further restrict your code to handle only cases where the base is a variable or constant for simplicity if you prefer.

You can find various derivative rules here.
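
For quick reference, the standard rules that map onto the five Expr constructors are summarized below. Here $u$ and $v$ stand for sub-expressions, $c$ and $n$ for constants, and $u'$ for the derivative of $u$ with respect to $x$. Treating any variable other than $x$ as a constant is an assumption of this summary.

```latex
\begin{align*}
\frac{d}{dx}\,c &= 0 \\
\frac{d}{dx}\,x &= 1, \qquad \frac{d}{dx}\,y = 0 \quad (y \neq x) \\
\frac{d}{dx}\,(u + v) &= u' + v' \\
\frac{d}{dx}\,(u \cdot v) &= u'\,v + u\,v' \\
\frac{d}{dx}\,u^{n} &= n\,u^{n-1}\,u' \quad (n \text{ constant})
\end{align*}
```

The last line is the power rule combined with the Chain Rule; when $u$ is just $x$, the factor $u'$ is 1 and it reduces to the familiar $n\,x^{n-1}$.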

Tests

The main method for Expressions contains a number of test cases for derive and the other methods you will write. Uncomment those tests as you implement the methods. Those are the cases the autograder uses, so if you pass them you should be all set.

You will find a description of how those tests work below.

Evaluate. Write a function that evaluates a given expression in a given environment. An environment is just a mapping from symbols to values for those symbols. Your function should have the following signature:
```
def eval(e: Expr, env: Map[String, Double]): Double
```
If a variable in the expression is not in the given environment, you should throw an exception.
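
The same environment idea can be sketched on a different, much smaller datatype. Everything below (BoolExpr, BVar, Not, And, and BoolEval) is hypothetical and chosen so as not to overlap with Expr; the choice of NoSuchElementException is also just one reasonable option.

```scala
sealed abstract class BoolExpr
case class BVar(name: String) extends BoolExpr
case class Not(e: BoolExpr) extends BoolExpr
case class And(l: BoolExpr, r: BoolExpr) extends BoolExpr

object BoolEval {
  // Evaluate a boolean expression, looking variables up in `env`.
  def eval(e: BoolExpr, env: Map[String, Boolean]): Boolean = e match {
    // getOrElse evaluates its second argument only when the key is missing,
    // so the exception is thrown only for unbound variables.
    case BVar(n) =>
      env.getOrElse(n, throw new NoSuchElementException("unbound variable: " + n))
    case Not(inner) => !eval(inner, env)
    case And(l, r)  => eval(l, env) && eval(r, env)
  }
}
```

With env = Map("p" -> true, "q" -> false), evaluating And(BVar("p"), Not(BVar("q"))) gives true, while evaluating BVar("z") throws.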

Plot. We've written a plot(e) function for you. It draws an ASCII plot of an expression treated as a function of x over the range [-10, 10]. Add a few calls to your main: for at least three expressions of your choice, plot both the expression and its derivative. For example:
```
val e = Expr("x^3 - 12*x");
plot(e);
plot(derive(e, "x"));
```
Notice that the same plot call renders the derivative — no new code needed, because derive returns an Expr and plot takes any Expr. Sanity-check visually: the zeros of the derivative should line up with the peaks and valleys of the original.

Simplify. Write a function that when given an expression reduces that expression to its simplest form. Your function should have the following signature:
```
def simplify(e: Expr): Expr
```
For example,
```
simplify(Sum(Variable("x"),Constant(0)))
```
should return Variable("x"). Your function need not be exhaustive, but should handle at least the following kinds of simplifications:
- Additive identity: x + 0 → x, 0 + x → x
- Multiplicative identity: x * 1 → x, 1 * x → x
- Multiplicative zero: x * 0 → 0, 0 * x → 0
- Constant folding: 2 + 3 → 5, 2 * 3 → 6
Note our tests assume that x ^ 1 is not simplified to x -- that is a valid simplification, but if you add it you may get some test failures... Feel free to experiment with more simplifications in the extra credit section below.

Testing with assertEquals

In order to make the task of writing tests easier, we provide an expression parser. The expression parser takes a string and returns its corresponding Expr. The expression parser can be invoked on a string str like so: Expr(str). For example, to demonstrate that your simplifier knows about the additive identity of zero, you might write the following test:

assertEquals(Expr("x"), simplify(Expr("x + 0")))
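
Comparing two expression trees works because case classes get structural equality for free. The sketch below contrasts a case class with an ordinary class; Pair and PlainPair are hypothetical names, and it is an assumption here that the provided assertEquals compares its arguments with ==.

```scala
object EqualitySketch {
  // Hypothetical case class: == compares the fields.
  case class Pair(l: Int, r: Int)

  // Ordinary class, for contrast: == falls back to reference identity.
  class PlainPair(val l: Int, val r: Int)

  val sameCase: Boolean = Pair(1, 2) == Pair(1, 2)
  val samePlain: Boolean = new PlainPair(1, 2) == new PlainPair(1, 2)
}
```

Here sameCase is true and samePlain is false. The same field-by-field comparison applies recursively to nested case classes such as Sum and Product.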

The syntax that the expression parser accepts can be expressed by the following grammar:

             expr := sum 
              sum := product { ("+" | "-") product } 
          product := power { "*" power } 
            power := factor [ "^" factor ] 
           factor := "(" expr ")" | variable | constant 
         variable := ident 
         constant := floatLit 
         floatLit := [ "-" ] positiveFloat 
    positiveFloat := numericLit [ "." [ numericLit ] ]

4. Smarter `simplify` (extra credit, up to 20 points)

This problem is extra credit. You can try as little or as much as you'd like on this one.

Put your code in the file ExtraCredit.scala in your repository. It contains an ExtraCredit object with a stub simplify and a main method that runs through every example below, printing each input alongside its expected form. Compile and run with:

scalac -cp .:scala-parser-combinators.jar Expressions.scala ExtraCredit.scala
scala  -cp .:scala-parser-combinators.jar ExtraCredit

Note the two-step compile/run pattern. In Problem 3 you could hand Expressions.scala directly to the scala command, and Scala would compile and run that single file as a script. Here, ExtraCredit.scala imports types and helpers from Expressions.scala, and Scala's script mode only compiles the one file you hand it. So you have to compile both files into class files with scalac first, and then run the compiled ExtraCredit object by name with scala.

Note

Submit your Lab to both the "Lab 7" and the "Lab 7 Extra Credit" assignment in Gradescope if you work on this part!

The simplify from Problem 3 only rewrites a small, fixed set of patterns near the root of the tree. In this extra-credit problem, extend it to handle progressively trickier expressions. Each tier builds on the previous and is worth a few more points. The autograder tests the exact canonical forms listed below (left-associative, constants folded, commutative operands sorted with the constant on the left for * and on the right for +).

Tier 1 — recurse. Simplify children before applying rules at the root.

(x + 0) + (y * 1) → x + y
2 + 3 + 4 → 9

Tier 2 — more identity rules.

x ^ 0 → 1
1 ^ x → 1

Tier 3 — combine like terms.

x + x → 2 * x
x * x → x ^ 2
2 * x + 3 * x → 5 * x

Tier 4 — canonical form. Needed to catch commutative variants.

x * 2 + x * 3 → 5 * x
3 + x + 2 → x + 5
y + x → x + y
x * x * x → x ^ 3

Tier 5 — distribute, then recombine.

2 * (x + 3) → 2 * x + 6
(x + 1) * (x - 1) → x ^ 2 + -1
(x + 1) ^ 2 → x ^ 2 + 2 * x + 1

Does your simplifier terminate?

"Keep applying rules until nothing changes" is only safe if every rule strictly decreases some measure of the expression — size, node count, number of Product nodes containing a Sum, lexicographic order, etc. Tier 5 distribution actually grows the tree ((x+1)^2 → x^2 + 2*x + 1), so picking the right measure matters. If you ever write the inverse of a rule you already have (e.g. factoring x*y + x*z back into x*(y+z) alongside a distribution rule), your simplifier will loop forever. Convince yourself each rule is one-way with respect to some measure you can name.

In general, deciding whether a set of rewrite rules terminates is undecidable: there is no algorithm that takes an arbitrary rule set and answers yes or no. This is why real computer-algebra systems (Mathematica, Maple, SymPy, Maxima) commit to a specific direction for each rule in their "canonical form" routines and add explicit guards against runaway evaluation (for example, Mathematica's $RecursionLimit and $IterationLimit). Your toy simplifier sits in the same design space as those systems, just with far fewer rules.
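
A loop guard of that kind can be sketched generically, independent of Expr. The helper below is hypothetical (the names FixpointSketch, fixpoint, step, and fuel are made up); it applies a rewriting step until the value stops changing or a step budget runs out.

```scala
object FixpointSketch {
  // Apply `step` until the value stops changing or the fuel runs out.
  // Returns Some(result) at a fixed point, or None if the budget is exhausted.
  @scala.annotation.tailrec
  def fixpoint[A](x: A, fuel: Int)(step: A => A): Option[A] =
    if (fuel <= 0) None
    else {
      val next = step(x)
      if (next == x) Some(x) else fixpoint(next, fuel - 1)(step)
    }
}
```

For example, FixpointSketch.fixpoint(100, 50)(n => n / 2) returns Some(0), because halving eventually stops changing the value. In contrast, FixpointSketch.fixpoint(1, 10)(n => -n) returns None: the "rule" keeps flipping the sign, which is the same failure mode as a rule set that contains both a rewrite and its inverse. A guard like this turns an infinite loop into a visible failure, but it is not a substitute for a termination argument.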

Submitting Your Work

Submit your code to the Gradescope assignment named "Lab 7". You can submit in one of two ways:

- Upload files: Click "Upload" and select all of your source files, or
- Link GitHub: Click "GitHub" and select your repository and branch.

Please do not change the names of the starter files. Also:

- If you worked with a partner, only one of each pair needs to submit the code.
- Indicate who your partner is when you submit. Specifically, after you upload your files, there will be an "Add Group Member" button on the right of the Gradescope webpage -- click that and add your partner.

Autograding: Gradescope will run an autograder on your code that performs some simple tests. Be sure to look at the autograder output to verify your code works as expected. We will run more extensive tests on your code after the deadline.