This assignment should be turned in for evaluation. To turn in the assignment, you will use the turnin command:
turnin -c 010 concordance.c
It is ok to turn in the same file more than once. This will overwrite the earlier version. You may want to do this if you discover a mistake or make an improvement after turning in your assignment.
To read input, simply use scanf to break the input up into words:
scanf ("%s", nextWord);
Keep in mind that memory for nextWord must be allocated prior to calling scanf. scanf will then return in nextWord all consecutive characters appearing between whitespace (blanks, carriage returns, tabs, ...). Unfortunately, this means that some words will actually be numbers or punctuation or some combination of letters and these non-letters. For the purposes of this assignment, a word should contain only alphabetic characters. So, after scanf reads in a "word", send it to a function isWord (that you need to write) that should return true if the word contains only alphabetic characters. If it does, add it to the concordance.
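As a rough illustration of the check described above, here is one way isWord might look. This is only a sketch under some assumptions: it uses the C99 bool type from stdbool.h, and it treats an empty string as "not a word", which the assignment does not specify.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if word is non-empty and every character is alphabetic. */
bool isWord(const char *word)
{
    if (*word == '\0') {
        return false;   /* assumption: an empty string is not a word */
    }
    for (const char *p = word; *p != '\0'; p++) {
        /* Cast to unsigned char: passing a negative char value to
           isalpha is undefined behavior. */
        if (!isalpha((unsigned char) *p)) {
            return false;
        }
    }
    return true;
}
```

With this version, isWord("Alice") is true, while isWord("do:") and isWord("42") are false.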
scanf reads from standard input. It makes sense for this program to read from a file instead. Since we haven't done file manipulation yet, you don't know how to do that in C. To read from a file you should use Unix input redirection to tell your program that it should get its standard input from a file instead of from the keyboard. Here is how you do that:
-> concordance < my_input_file
Assuming I have a file called my_input_file in my current directory, the contents of this file will be sent to the concordance program just as if they had been entered from the keyboard. When the end of the file is reached, scanf will detect this and will return EOF (a negative value) instead of 1 to indicate that it was unable to set nextWord because it had run out of input.
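A reading loop built on this might look like the sketch below. The names countTokens and MAX_WORD_LEN are made up for illustration, and the 99-character limit is an assumption, not part of the assignment. The loop keeps going only while scanf reports that it filled in exactly one item, so it stops once the input runs out.

```c
#include <stdio.h>

#define MAX_WORD_LEN 99   /* hypothetical limit; the assignment does not give one */

/* Reads whitespace-separated tokens from standard input until scanf
   can no longer fill nextWord, and returns how many tokens were read. */
int countTokens(void)
{
    char nextWord[MAX_WORD_LEN + 1];   /* +1 for the terminating '\0' */
    int count = 0;

    /* scanf returns the number of items it assigned, which is 1 when a
       token was read. Any other return value means no token was read.
       The width 99 keeps scanf from writing past the end of nextWord. */
    while (scanf("%99s", nextWord) == 1) {
        count++;   /* a real program would test the token and store it here */
    }
    return count;
}
```

One consequence of the width limit is that a token longer than 99 characters is read in pieces. That seems acceptable for ordinary English text, but it is a design choice rather than a requirement.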
Here is an example of how the program should behave:
-> more sample.txt
Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, `and what is the use of a book,' thought Alice `without pictures or conversation?'
-> concordance < sample.txt
Alice a and beginning book but by conversations get had having her in into is it no nothing of on once or peeped pictures she sister sitting the thought tired to twice use very was what
->
You must decide how to define the concordance itself. Here are some options:
I'll leave it up to you to decide which implementation to use. I'm more concerned that you get practice with C than that you produce the fastest concordance around. To be sure that you do get more practice, don't use either of the first two solutions (1 long array or 1 long vector). If you do use a vector as part of one of the other suggested approaches, you can reuse the vector implementation from class and practice assignments in the solution.
Be sure to free the concordance after printing it out. Along the way, be sure to watch out for memory leaks and dangling references!
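How you free the concordance depends on the representation you chose. Purely as an illustration, suppose some part of it is a singly linked list of words. The WordNode type and the newNode and freeConcordance functions below are hypothetical names, not something the assignment prescribes. The sketch shows two habits that help: save the next pointer before freeing a node, and set the caller's pointer to NULL afterwards so it cannot dangle.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one word per node, singly linked. */
struct WordNode {
    char *word;               /* heap-allocated copy of the word */
    struct WordNode *next;
};

/* Allocates a node holding a copy of word, linked in front of next.
   Returns NULL if memory runs out. */
struct WordNode *newNode(const char *word, struct WordNode *next)
{
    struct WordNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->word = malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);           /* avoid leaking the half-built node */
        return NULL;
    }
    strcpy(node->word, word);
    node->next = next;
    return node;
}

/* Frees every node and its word, then clears the caller's pointer. */
void freeConcordance(struct WordNode **head)
{
    struct WordNode *node = *head;
    while (node != NULL) {
        struct WordNode *next = node->next;   /* save before freeing node */
        free(node->word);
        free(node);
        node = next;
    }
    *head = NULL;             /* the caller no longer holds a dangling pointer */
}
```

Each malloc in newNode has a matching free in freeConcordance, which is the pairing to check for in whatever structure you end up using.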
As always, it's a good strategy to write one function and then test it. Don't go on to the next function until the first one is working.
If you run your program on alice.txt, it should generate a list of about 2200 words.
The sample.txt and alice.txt files are in my shared/cs010/assignment3 directory.