EECS 2031 Software Tools, Winter 2014
Lab 5 (lab open during first 30 minutes, then closed in labtest mode)
Cellphones and other electronic devices must be off while you are in the lab.
Background Scenario
This lab uses the same bathroom scale data transmission scenario as
lab2 to lab4.
However, this time you have to adapt your solution for the previous labs to
output an overview of the weight change of the last valid user at the end of the input.
In the following, and for ease of reading, the changes relative to lab4
have been marked below with '***'.
As before, each line of input contains the following information:
timestamp userID weight
There are one or more space characters separating the three pieces of information, except that
the userID itself may now contain one or more spaces (as in "1000 John Jack Smith 123.45").
The fields are defined as follows:
- The timestamp is an integer with the number of seconds since 00:00, Jan 1, 1970 UTC,
which conforms to the standard specification of time in Unix/Linux systems.
- The userID is a string, which may contain spaces.
- The weight is a floating point number, specified to mean the weight in kilograms.
This protocol can be parsed deterministically,
as the weight is the first numeric field after a sequence of text fields, e.g. as in
timestamp userIDa userIDb userIDc userIDd weight
and none of the parts of the userID can start with a digit or '.'.
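The parsing rule above can be sketched as follows. This is only a minimal illustration, not a reference solution: the function name split_record, its parameters, and the MAX_LINE constant are assumptions made for this example, and the range checks from the requirements (zero timestamp, weight limits, userID length) are deliberately left out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1024   /* assumption: lines are shorter than 1024 characters */

/* Hypothetical helper: splits one input line into its three fields.
 * Returns 1 on success, 0 if a field is missing or malformed.
 * user_cap must be at least 1. Range checks are not performed here. */
int split_record(const char *line, long *timestamp,
                 char *user_id, size_t user_cap, double *weight)
{
    char buf[MAX_LINE];
    char *tok, *end;
    size_t used = 0, len;

    /* strtok() modifies its argument, so work on a private copy. */
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    /* Field 1: the timestamp must be a complete integer token. */
    tok = strtok(buf, " \t\r\n");
    if (tok == NULL)
        return 0;
    *timestamp = strtol(tok, &end, 10);
    if (end == tok || *end != '\0')
        return 0;

    /* Field 2: text tokens up to the first one starting with a digit or '.'. */
    user_id[0] = '\0';
    tok = strtok(NULL, " \t\r\n");
    while (tok != NULL && !isdigit((unsigned char)tok[0]) && tok[0] != '.') {
        len = strlen(tok);
        if (used + len + 2 > user_cap)
            return 0;               /* would not fit into user_id */
        if (used > 0)
            user_id[used++] = ' ';  /* join the parts with a single space */
        memcpy(user_id + used, tok, len + 1);
        used += len;
        tok = strtok(NULL, " \t\r\n");
    }
    if (used == 0 || tok == NULL)
        return 0;                   /* userID or weight is missing */

    /* Field 3: the first numeric-looking token is the weight. */
    *weight = strtod(tok, &end);
    return end != tok;
}
```

Because the parts of the userID are joined with exactly one space, two userIDs that differ only in the amount of whitespace compare equal with strcmp() after this step.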
Objective
- Write an ANSI-C program called lab5.c that reads lines from standard input,
parses them, and classifies the input according to the context set out above and the requirements
specified below.
- In the initial open lab, you are welcome to create a draft solution. You need to submit that using the command
submit 2031 pre5 lab5.c
With this, your initial work will be available to you in the closed part of the labtest.
You can submit other text files, such as files with input data, as well.
- Once the labtest starts, your work will be available to you in a subdirectory
called unsubmit in your home directory. This is a read-only location.
So you have to copy these files out first, e.g. via the following command
cp unsubmit/pre5/* ~
- Test that your program correctly implements the required functionality.
- Finally, submit your solution electronically before the end of
the lab test using the command
submit 2031 lab5 lab5.c
- You may submit your solution more than once. Additional documentation about the submit
command can be viewed by typing man submit.
Now create a new ANSI-C program that does the following.
Requirements
Your program must read the input line by line from standard input.
- If the timestamp field is missing, is not an integer, or is zero,
you must print Invalid time, followed by a newline character.
- The userID consists of a sequence of text strings each of which cannot start with a
digit or '.' character.
If the userID is missing, or if the userID is longer than 179 characters (including single
spaces in between the parts of the userID) you must print
Illegal userID, followed by a newline character.
- If the weight field is missing, or is less than 30.0 or more than 300.0,
you must print Illegal weight, followed by a newline character.
- If the timestamp of a record is not larger than the previous timestamp,
i.e., the current timestamp is in the past relative to the previous one or identical to it,
you must print Nonmonotonic timestamps, followed by a newline character.
- Additional text on the line after the last field should be silently ignored.
- If there are multiple problems with the line of input, you must print only
the message for the first field that does not follow the specification. Processing should
then continue with the next line. All rules need to be applied in the order specified here.
- Data from lines that pass all above criteria are considered valid records.
- If the information supplied in each line is otherwise fine,
and the userID is new, i.e., has not been seen before, you must print OK newuser
followed by a newline character.
Different amounts of whitespace
between the parts of a userID should be ignored in the test for equality of userID's.
- Otherwise, the userID has been seen before.
If the weight change between the current record and the last valid record
for the same userID exceeds 10 kg/day,
you must print Suspiciously large weight change, followed by a newline character.
Otherwise, you must print OK, followed by a newline character.
- ***
When the processing reaches the end of the input, i.e., upon EOF, the program
must output a bar chart, using ASCII characters.
This bar chart must contain vertical and horizontal axes drawn with '|' and '-' characters,
respectively. The origin must be represented as a '+' character.
The "length" of the horizontal dimension needs to match the amount of data that can be shown,
as specified below.
Added: The bar chart must always be 10 lines "high" (plus the line for the horizontal axis).
- ***
The bar chart must show the change of weight over time for the userID with the last valid record.
For this, you must scale the information so that
30kg and 300kg are represented by 1 and 10 '*' characters respectively as vertical bars.
You must truncate weight values, i.e., a weight of 59.999kg will still yield only a single star.
To simplify the lab, the time dimension on the horizontal axis must correspond only
to the sequence of weight records, and must not be spaced according to the time stamps themselves.
Thus, each record should be represented by a single character in the horizontal dimension.
- ***
You are required to use a dynamically allocated 2D array for your solution.
- No other output must be produced.
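As a worked example of the 10 kg/day rule above, the check can be written without dividing by the elapsed time. This is a sketch only; the function name is_suspicious and the two constants are assumptions made for this example.

```c
#define SECONDS_PER_DAY 86400.0
#define MAX_KG_PER_DAY  10.0

/* Hypothetical helper: returns 1 if the weight change between two records
 * of the same user exceeds 10 kg/day, and 0 otherwise.
 * Assumes now > prev_time, which the monotonic-timestamp rule guarantees. */
int is_suspicious(long prev_time, double prev_weight,
                  long now, double weight)
{
    double days = (double)(now - prev_time) / SECONDS_PER_DAY;
    double diff = weight - prev_weight;

    if (diff < 0.0)
        diff = -diff;               /* weight loss counts like weight gain */
    return diff > MAX_KG_PER_DAY * days;
}
```

For instance, a change from 30.1 to 30.2 within one second corresponds to 8640 kg/day and is flagged, whereas a change from 30.2 to 30.1 over 10000 seconds is below 1 kg/day and is not.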
For the purpose of this lab, you do not need to worry about overflow. In other words, you
can safely assume that timestamps are guaranteed to fit in 32 bit integers,
userID's will not be longer than
1000 characters including spaces, and floating point numbers will fit
into an ANSI-C float variable.
You can also safely assume that there is always at least one space character
between the fields and between each part of the userID.
Moreover, each line of input is guaranteed to be less than 1024 characters long.
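One way to read the scaling rule for the bar chart is that each star stands for 30 kg, so the number of stars is the weight divided by 30, truncated. The sketch below assumes that reading; the function name stars_for_weight is made up for this example.

```c
/* Hypothetical helper: number of '*' characters for a valid weight.
 * 30 kg maps to 1 star and 300 kg to 10 stars, i.e. one star per 30 kg.
 * The cast to int truncates toward zero. */
int stars_for_weight(double weight_kg)
{
    return (int)(weight_kg / 30.0);
}
```

Under this assumption, 59.999 kg gives one star, 60 kg gives two, and 89 kg still gives two.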
Assuming that the program is started with lab5, and given the following input, which
is also provided for convenience as a file input.txt:
3600 godzilla's kid 30.0
36000 godzilla 299
36001 godzilla's kid 30.1
36002 godzilla's kid 30.2
46002 godzilla's kid 30.1
46999 godzilla 30.1
60000 john jack andrew wolfgang jiang rami tom bob robert frank richard smith 123.5
500000 godzilla's kid 60
1000000 godzilla's kid 100
3000000 godzilla's kid 300
5000000 godzilla's kid 89
your program should create the following output, provided for convenience as
a file expectedoutput.txt (file and example output updated, 2 times):
OK newuser
OK newuser
OK
Suspiciously large weight change
OK
Suspiciously large weight change
OK newuser
OK
OK
OK
OK
|      *
|      *
|      *
|      *
|      *
|      *
|      *
|     **
|    ****
|********
+--------
Hints:
- I suggest you first change your code to store a weight record for each valid record.
Insert records either at the end or at the beginning of the list, but keep the order consistent.
Then, at the end of the program, identify how many records there are for the last
valid userID, allocate an appropriately sized two-dimensional array of characters, iterate over
the records for the last valid userID and draw the bars into the array.
For each bar use a single for-loop that adds characters in the vertical direction.
Then simply print out each line of the array to generate the final bar chart.
- Given that there is no specified limit on the number of weight records,
I suggest that you use a linked list to store the userID's.
- It is usually easier to add each individual requirement to your code only after
you have verified that the previous requirement is met by your program.
- Remember to look up information related to this lab well in advance or (at the very latest)
in the first 30 minutes of the lab.
- Many of the test cases for previous labs are still valid for lab5.
You can use those test cases to test if your submission deals correctly with the
unchanged aspects of the specification.
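The linked-list suggestion in the hints above could look roughly like the following sketch. The struct layout, the field names, MAX_ID, and both function names are assumptions made for this example; inserting at the tail keeps the records in input order.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ID 180   /* assumption: 179 characters plus the terminating NUL */

/* Hypothetical record type; the field names are illustrative only. */
struct record {
    long timestamp;
    char user_id[MAX_ID];
    double weight;
    struct record *next;
};

/* Appends a record at the tail so the list stays in input order.
 * Returns the new node, or NULL if malloc() fails. */
struct record *append_record(struct record **head, struct record **tail,
                             long timestamp, const char *user_id,
                             double weight)
{
    struct record *node = malloc(sizeof *node);

    if (node == NULL)
        return NULL;
    node->timestamp = timestamp;
    strncpy(node->user_id, user_id, MAX_ID - 1);
    node->user_id[MAX_ID - 1] = '\0';
    node->weight = weight;
    node->next = NULL;
    if (*tail != NULL)
        (*tail)->next = node;
    else
        *head = node;               /* first record in the list */
    *tail = node;
    return node;
}

/* Counts the records stored for one userID. */
int count_records(const struct record *head, const char *user_id)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        if (strcmp(head->user_id, user_id) == 0)
            n++;
    return n;
}
```

The count returned for the last valid userID is then the width of the bar chart.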
Additional hints:
- The entries in the list are (very) likely stored in either increasing or decreasing
time order. For either case, you only need to adapt your processing for drawing the bars
into the 2D array to work from left-to-right or right-to-left, respectively.
- Drawing the bars from the bottom (stalagmites) is slightly more complex than drawing
them hanging from the top (stalactites). Consequently, consider drawing the bar
chart in memory as "hanging from the top" and then outputting it bottom up
by reversing the direction of the final printing loop.
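The drawing strategy from the hints can be sketched as follows. This is one possible arrangement, not the required one: the names make_grid, print_grid, free_grid, and CHART_ROWS are assumptions made for this example, the star counts are assumed to be already computed (one int per record), and each row is padded with spaces to the full chart width.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHART_ROWS 10

/* Frees a grid built by make_grid(); individual rows may be NULL. */
void free_grid(char **grid)
{
    int r;

    if (grid == NULL)
        return;
    for (r = 0; r < CHART_ROWS; r++)
        free(grid[r]);              /* free(NULL) is a no-op */
    free(grid);
}

/* Builds a CHART_ROWS x count grid with one bar per entry of stars[].
 * Row 0 holds the first star of each bar, so every bar starts at index 0
 * and is filled by a single loop. Returns NULL if memory runs out. */
char **make_grid(const int *stars, int count)
{
    char **grid;
    int r, c;

    grid = malloc(CHART_ROWS * sizeof *grid);
    if (grid == NULL)
        return NULL;
    for (r = 0; r < CHART_ROWS; r++)
        grid[r] = NULL;
    for (r = 0; r < CHART_ROWS; r++) {
        grid[r] = malloc((size_t)count + 1);
        if (grid[r] == NULL) {
            free_grid(grid);
            return NULL;
        }
        memset(grid[r], ' ', (size_t)count);
        grid[r][count] = '\0';      /* each row is a printable string */
    }
    for (c = 0; c < count; c++)
        for (r = 0; r < stars[c] && r < CHART_ROWS; r++)
            grid[r][c] = '*';
    return grid;
}

/* Prints the highest row index first, which turns the chart upright,
 * then the horizontal axis. */
void print_grid(char **grid, int count)
{
    int r, c;

    for (r = CHART_ROWS - 1; r >= 0; r--)
        printf("|%s\n", grid[r]);
    putchar('+');
    for (c = 0; c < count; c++)
        putchar('-');
    putchar('\n');
}
```

A caller would build the stars array from the records of the last valid userID, call make_grid() and print_grid(), and release the memory with free_grid().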
Added: How to quickly identify the cause of a segmentation violation
To identify the very likely cause of an illegal memory access, first
(re-)compile your program with the "-g" flag and start in the debugger as follows:
gcc -g -o lab5 lab5.c
gdb lab5
Now you will get the debugger prompt. Type in one of the following two commands:
run
run < input.txt
depending on whether you want to provide input yourself or use predefined input, respectively.
Once your program stops
due to a segmentation violation you will see where it stopped. To get a bit more
information, use the following command:
where
which will show you a stack trace, similar to a Java exception. This will identify the
line currently executing in the program. If you see multiple lines, that may mean the
program crashed inside a function, e.g., because you passed a NULL pointer to strcmp().
In that case, look for the line that lists your source file (lab5.c). Note the number
at the beginning of that line and then "jump" to that location via
frame number
E.g., "frame 1".
Now use the
list
command to see the code around this location. Then use
print expression
to look at the value of a variable, e.g. "print p". If you see the value of a pointer
being zero, then the likely problem is a dereference of a null pointer. Otherwise, you
are likely indexing with the statement identified above into a memory location
that was not malloc()'ed by the program,
typically because you are indexing beyond the end of an array/memory region.
A good indication of that is a print command that fails: in that case the pointer
or array index itself is invalid.
Note that the print command permits (most) C expressions, including "print *p",
"print p->next" and "print a[i]", which gives you several options to diagnose a problem.
quit
will quit the debugger. You can now fix your program according to the insight you gained.