CSE3221 (Summer 2013) Assignment #2

 

You have to work individually. You are not allowed to view or exchange documents with your peers. We treat all sorts of cheating very seriously. Do not copy code from anywhere, even as a "template". No late submissions will be accepted. Your work will be graded based on: i) correctness of programming logic; ii) clarity of your code and your coding style; iii) whether your code follows the specification. It is your responsibility to explain every detail of your code clearly with appropriate comments, to avoid any possible confusion in marking.

 

Due date: 23:59, July 3rd (Wed).

 

Your task is to write a C program that creates two groups of Pthreads, an IN group and an OUT group, to create an exact copy of a source file passed as a command-line argument.

 

The original main thread is not part of either group. The main() function should open the source file, create and initialize a circular buffer, and create all IN and OUT threads. Then, the main thread waits for all these threads to finish. All IN and OUT threads share the circular buffer. Each buffer slot stores 2 pieces of information: one data byte read from the source file and its offset in the source file.

 

typedef  struct {

     char  data ;

     off_t offset ;

} BufferItem ;
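
One possible way to organize the shared circular buffer is sketched below, assuming two counting semaphores (empty and full slots) and one mutex protecting the indices. The names CircularBuffer, buffer_init, buffer_put and buffer_get are hypothetical and are not prescribed by this assignment; only BufferItem is.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct {
    char  data;
    off_t offset;
} BufferItem;

/* Hypothetical wrapper type; the assignment only prescribes BufferItem. */
typedef struct {
    BufferItem     *items;
    int             capacity;
    int             in;     /* index of the next slot to fill  */
    int             out;    /* index of the next slot to drain */
    sem_t           empty;  /* counts empty slots  */
    sem_t           full;   /* counts filled slots */
    pthread_mutex_t lock;   /* protects in and out */
} CircularBuffer;

/* Returns 0 on success, -1 on failure. */
int buffer_init(CircularBuffer *b, int capacity) {
    b->items = malloc((size_t)capacity * sizeof(BufferItem));
    if (b->items == NULL) return -1;
    b->capacity = capacity;
    b->in = b->out = 0;
    if (sem_init(&b->empty, 0, (unsigned int)capacity) != 0) return -1;
    if (sem_init(&b->full, 0, 0) != 0) return -1;
    return pthread_mutex_init(&b->lock, NULL) == 0 ? 0 : -1;
}

/* Blocks until a slot is empty, stores item, returns the index used. */
int buffer_put(CircularBuffer *b, BufferItem item) {
    int index;
    sem_wait(&b->empty);
    pthread_mutex_lock(&b->lock);
    index = b->in;
    b->items[index] = item;
    b->in = (b->in + 1) % b->capacity;
    pthread_mutex_unlock(&b->lock);
    sem_post(&b->full);
    return index;
}

/* Blocks until a slot is filled, copies it out, returns the index used. */
int buffer_get(CircularBuffer *b, BufferItem *item) {
    int index;
    sem_wait(&b->full);
    pthread_mutex_lock(&b->lock);
    index = b->out;
    *item = b->items[index];
    b->out = (b->out + 1) % b->capacity;
    pthread_mutex_unlock(&b->lock);
    sem_post(&b->empty);
    return index;
}
```

In this sketch buffer_get blocks forever once the buffer is empty, so a complete program would still need some way to tell the OUT threads that nothing is left.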

 

Each IN thread goes to sleep (use nanosleep) for some random time between 0 and 0.01 seconds upon being created. Then, it reads the next single byte from the file and saves that byte and its offset in the file to the next available empty slot in the circular buffer. Then, this IN thread goes to sleep (use nanosleep) for some random time between 0 and 0.01 seconds and then goes back to read the next byte, until the end of the file is reached.
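
The random sleep could be written roughly as follows. This is only a sketch: the helper names random_delay and sleep_random are hypothetical, and it assumes each thread keeps its own seed for rand_r so that threads do not share random-number state.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

#define MAX_SLEEP_NS 10000000L   /* 0.01 seconds expressed in nanoseconds */

/* Pick a delay in [0, 0.01) seconds. seed is per-thread state for rand_r. */
struct timespec random_delay(unsigned int *seed) {
    struct timespec ts;
    double fraction = (double)rand_r(seed) / ((double)RAND_MAX + 1.0);
    ts.tv_sec = 0;
    ts.tv_nsec = (long)(fraction * MAX_SLEEP_NS);
    return ts;
}

/* Sleep for a random delay, resuming if a signal interrupts the sleep.
 * Returns 0 on success, -1 on any error other than EINTR. */
int sleep_random(unsigned int *seed) {
    struct timespec req = random_delay(seed), rem;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR) return -1;
        req = rem;   /* continue with the time that was left */
    }
    return 0;
}
```

A thread might seed its generator from something that differs per thread, for example its thread number combined with the current time.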

 

Similarly, upon being created, each OUT thread sleeps (use nanosleep) for some random time between 0 and 0.01 seconds. It then reads a byte and its offset from the next available nonempty buffer slot, and writes the byte to that offset in the target file. Then, it also goes to sleep (use nanosleep) for some random time between 0 and 0.01 seconds and goes back to copy the next byte, until nothing is left.
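
Writing a byte to its recorded offset might look like the sketch below. The helper names open_target and write_item are hypothetical, as is the 0644 permission mode. The sketch assumes that all OUT threads share one file descriptor, so the lseek and the write are kept together under one mutex to prevent another thread from moving the file position in between.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    char  data;
    off_t offset;
} BufferItem;

/* Create the copy, truncating any existing file of the same name.
 * Returns a file descriptor, or -1 on failure. */
int open_target(const char *path) {
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}

/* Write one byte at the offset stored with it.
 * Returns 0 on success, -1 on failure. */
int write_item(int fd, pthread_mutex_t *copy_lock, const BufferItem *item) {
    int result = -1;
    pthread_mutex_lock(copy_lock);
    if (lseek(fd, item->offset, SEEK_SET) != (off_t)-1 &&
        write(fd, &item->data, 1) == 1) {
        result = 0;
    }
    pthread_mutex_unlock(copy_lock);
    return result;
}
```

Because every byte carries its own offset, the bytes can arrive in any order and the copy still ends up identical to the source.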

 

Along the way, each thread writes some information to two log files, so we can better trace the execution of your program.

 

Since all threads access common data, synchronization will be required.  You may wish to look at the man pages for pthread_create, pthread_mutex_init, sem_init and other related Pthreads API functions. Use critical sections of code.  You should make your critical sections as small as possible.  For example, IN threads should not have one big critical section where they do all of the following together: (a) read from the file; (b) write to the buffer; (c) write to the log file.  Instead, they should have 3 critical sections, one for each of (a) – (c). Similarly, OUT threads should have separate critical sections for reading the buffer, writing the copy and writing the log file.

 

The program should be called copy.c and will be compiled with:

 

cc -Wall -o cpy copy.c -lpthread

 

It will be invoked as follows:

 

 cpy <nIN> <nOUT> <file> <copy> <bufSize> <IN_Log> <OUT_Log>

 

 <nIN> is the number of IN threads to create. There should be at least 1.

 <nOUT> is the number of OUT threads to create. There should be at least 1.

 <file> is the pathname of the file to be copied. It should exist and be readable.

 <copy> is the name to be given to the copy. If a file with that name already exists, it should be overwritten.

 <bufSize> is the capacity, in terms of BufferItems, of the shared buffer. This should be at least 1.

 <IN_Log> is the file to which the IN threads write some trace information. If a file with that name already exists, it should be overwritten.

 <OUT_Log> is the file to which the OUT threads write some trace information. If a file with that name already exists, it should be overwritten.
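
The argument checks listed above could be sketched as follows. The Config structure and the function names are hypothetical; the sketch only checks the argument count and the three numeric arguments, and leaves opening the files to the caller.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical container for the seven command-line arguments. */
typedef struct {
    int         n_in;
    int         n_out;
    const char *file;
    const char *copy;
    int         buf_size;
    const char *in_log;
    const char *out_log;
} Config;

/* Parse a decimal integer that must be at least 1.
 * Returns 0 on success, -1 if s is not a valid positive int. */
static int parse_positive(const char *s, int *out) {
    char *end;
    long value;
    errno = 0;
    value = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0') return -1;
    if (value < 1 || value > INT_MAX) return -1;
    *out = (int)value;
    return 0;
}

/* Expects: cpy <nIN> <nOUT> <file> <copy> <bufSize> <IN_Log> <OUT_Log>.
 * Returns 0 on success, -1 on a usage error. */
int parse_args(int argc, char **argv, Config *cfg) {
    if (argc != 8) return -1;
    if (parse_positive(argv[1], &cfg->n_in) != 0) return -1;
    if (parse_positive(argv[2], &cfg->n_out) != 0) return -1;
    if (parse_positive(argv[5], &cfg->buf_size) != 0) return -1;
    cfg->file    = argv[3];
    cfg->copy    = argv[4];
    cfg->in_log  = argv[6];
    cfg->out_log = argv[7];
    return 0;
}
```

strtol is used here rather than atoi so that input such as "3x" or an empty string can be rejected instead of being silently accepted.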

  

As an example, we have posted a data set on the course website: dataset4, for your reference.


The log files

=============

 

The two log files have no part in the file copying, but let us better trace the execution of your program.

 

Each of the IN threads should be given a different number in the range 0 … <nIN>-1.  Each of the OUT threads should be given a different number in the range 0 … <nOUT>-1. Each thread should know its own number. (This number is different from the thread id.)
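
One way to give each thread its own number is to pass a pointer to a per-thread argument record, as in the sketch below. ThreadArg, spawn_group, join_group and demo_worker are hypothetical names; the demo worker only records its number so that the numbering can be checked.

```c
#include <pthread.h>

/* Hypothetical per-thread argument record. */
typedef struct {
    int   number;   /* 0 .. n-1 within the thread's own group */
    void *shared;   /* pointer to whatever state the group shares */
} ThreadArg;

/* Create n threads running fn; thread i receives &args[i].
 * Returns the number of threads actually created. */
int spawn_group(pthread_t *tids, ThreadArg *args, int n,
                void *(*fn)(void *), void *shared) {
    int i;
    for (i = 0; i < n; i++) {
        args[i].number = i;
        args[i].shared = shared;
        if (pthread_create(&tids[i], NULL, fn, &args[i]) != 0) break;
    }
    return i;
}

/* Wait for the first n threads in tids to finish. */
void join_group(pthread_t *tids, int n) {
    for (int i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
    }
}

/* Demo thread body: stores its own number in a shared int array. */
void *demo_worker(void *p) {
    ThreadArg *arg = p;
    int *seen = arg->shared;
    seen[arg->number] = arg->number;
    return NULL;
}
```

Each thread gets its own element of args. Passing the address of the loop counter instead would be a bug, because the counter keeps changing while the new threads are starting up.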

 

When an IN thread reads the next unread byte from the file, it can obtain the offset using the lseek system call. When the IN thread saves the byte and its offset to the buffer, it writes to a particular index in the buffer.  Each time IN thread number n reads a byte from offset x in the file and writes it to index i in the buffer, it should write the line "n x i" to the <IN_Log> file.  More exactly, it should write its thread number, followed by a single blank, followed by the offset in the file, followed by a single blank, followed by the index in the buffer, followed by a newline character '\n'.
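
Reading the next byte together with its offset might be sketched like this. The helper names open_source and read_next_byte are hypothetical, and the sketch assumes all IN threads share one file descriptor, so querying the position with lseek and reading the byte happen inside the same short critical section.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    char  data;
    off_t offset;
} BufferItem;

/* Open the source file read-only. Returns a descriptor, or -1 on failure. */
int open_source(const char *path) {
    return open(path, O_RDONLY);
}

/* Read the next unread byte and record where it came from.
 * Returns 1 if a byte was read, 0 at end of file, -1 on error. */
int read_next_byte(int fd, pthread_mutex_t *file_lock, BufferItem *item) {
    off_t offset;
    ssize_t n;

    pthread_mutex_lock(file_lock);
    /* Seeking 0 bytes from the current position reports that position,
     * which is the offset of the byte about to be read. */
    offset = lseek(fd, 0, SEEK_CUR);
    n = (offset == (off_t)-1) ? -1 : read(fd, &item->data, 1);
    pthread_mutex_unlock(file_lock);

    if (n == 1) {
        item->offset = offset;
        return 1;
    }
    return n == 0 ? 0 : -1;
}
```

Only the file access is inside the lock here; storing the item in the buffer and writing the log line would each be protected separately.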

 

Similarly, each OUT thread also writes n x i to the <OUT_Log> file. More exactly, it writes its thread number, followed by a single blank, followed by the offset in the file where it writes its byte, followed by a single blank, followed by the index in the buffer where it read its byte, followed by a newline character '\n'.
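
The "n x i" line format used by both log files could be produced as in the sketch below. format_log_line and write_log_line are hypothetical names. Since off_t has no portable printf conversion, the sketch casts the offset to long long.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>

/* Format "n x i\n" into buf: thread number, blank, offset, blank, index.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if buf is too small. */
int format_log_line(char *buf, size_t size, int n, off_t x, int i) {
    int len = snprintf(buf, size, "%d %lld %d\n", n, (long long)x, i);
    if (len < 0 || (size_t)len >= size) return -1;
    return len;
}

/* Append one line to a log stream inside its own short critical section.
 * Returns 0 on success, -1 on failure. */
int write_log_line(FILE *log, pthread_mutex_t *log_lock,
                   int n, off_t x, int i) {
    char line[64];
    int result;
    if (format_log_line(line, sizeof line, n, x, i) < 0) return -1;
    pthread_mutex_lock(log_lock);
    result = (fputs(line, log) == EOF) ? -1 : 0;
    pthread_mutex_unlock(log_lock);
    return result;
}
```

The line is formatted before the lock is taken, which keeps the critical section down to the single fputs call.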

 

Do not deviate from this format and do not add headings, summary information or user prompts.  This will make it much easier for us to examine these files using a shell script.

 

What to submit?

 

Submit the C program using the following command from your PRISM account:

 

submit 3221 a2 copy.c

 

Include the following information (please complete) as a comment at the beginning of your C program:

/*

Family Name:

Given Name:

Section:

Student Number:

CS Login:

*/

 

No hardcopy is needed for this assignment.