Keyword search program using multiple threads

Project Description

You will design and implement a keyword search program called ks_bb, which takes two command-line arguments from the user as follows:

./ks_bb commandFile bufSize

commandFile is the name of a file that lists the search commands to be performed. Each search command is composed of a keyword and a dirPath. A sample commandFile may look as follows:

/home/user/dirOne is
/home/user/dirOne a
/home/user/dirTwo a

For each command, your program will search for the keyword in all the files in the given directory using multiple threads. You can make the following assumptions about the commandFile:

1. The dirPath is a full path.
2. The keyword is a single word, starting with a whitespace character and ending with a whitespace character, not including any whitespace character in the middle. A whitespace character is either a , , or character. There is no empty keyword specified.
3. Each line of the commandFile will never exceed MAXLINESIZE characters (including the directory path, keyword, and whitespace(s) in between).

Your program will take the commands from the commandFile one by one and create a child process to serve each command. The command info will be passed to the child process. The child process will then create a number of threads (one thread per input file in the given directory) to find the matching lines in the input files. These are the worker threads. A worker thread is responsible for exactly one input file: it scans the file and looks for matching lines. You can assume that there will be only files in the given dirPath, no sub-directories.

A match in a line is an exact match of an entire word only, and the length of a line in an input file will never exceed MAXLINESIZE. When a match is found in a line, the worker thread prepares an item that includes the following information: the name of the file where the match occurred, the line number, and the line itself (i.e.
the line string excluding the newline character at the end).

If the keyword is seen, for example, in 5 separate lines, then 5 separate items will be created. If the keyword is seen more than once in a line, a single item is created for that line. For example, a line such as ‘a a a’ contains 3 matches, but only a single item is generated for it. Whenever an item is created, the worker thread adds it to a memory buffer: a bounded buffer.

The second argument of your program (bufSize) specifies the size of the bounded buffer to be used by the threads. Each worker thread adds the items it produces to this bounded buffer. The buffer can hold at most bufSize items. The bounded buffer can be implemented in one of two ways: 1) as a linked list of items; or 2) as a circular array of pointers to items. You can choose either of these implementation options. A worker thread adds a new item to the end of the buffer. The buffer has to be accessed by all worker threads. If a worker thread finds the buffer full while trying to insert an item, the thread has to sleep (block) until there is space for one more item. Since all worker threads access the buffer concurrently, the access must be coordinated; otherwise race conditions can occur. You can use semaphores, mutexes, or condition variables to protect the buffer and to synchronize the threads so that they sleep and wake up when necessary.

Besides the worker threads, the child process creates one more thread while serving a request: the printer thread. Its job is to retrieve the items from the bounded buffer and print them to the screen. While the worker threads add items to the buffer, the printer thread works concurrently, retrieving items from the buffer and printing them to the screen. The printer thread will try to retrieve one item from the bounded buffer.
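The circular-array option for the bounded buffer could be sketched as below. This is only one possible layout, using a mutex and two condition variables; the names item_t, bbuf_t, bb_init, bb_put, bb_get, and bb_destroy, as well as the MAXNAMESIZE limit, are assumptions made for illustration and are not part of the assignment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAXLINESIZE 1024
#define MAXNAMESIZE 256   /* assumed file-name limit; not given in the assignment */

/* One match: file name, line number, and the line without its '\n'. */
typedef struct {
    char filename[MAXNAMESIZE];
    int  lineno;
    char line[MAXLINESIZE + 1];
} item_t;

/* Bounded buffer: a circular array of pointers to items. */
typedef struct {
    item_t **slots;
    int capacity;               /* bufSize from the command line */
    int count;                  /* items currently stored */
    int head;                   /* index of the oldest item */
    int tail;                   /* index of the next free slot */
    pthread_mutex_t lock;
    pthread_cond_t  not_full;   /* signalled when a slot is freed */
    pthread_cond_t  not_empty;  /* signalled when an item is added */
} bbuf_t;

/* Returns 0 on success, -1 if the slot array cannot be allocated. */
int bb_init(bbuf_t *b, int capacity) {
    assert(capacity > 0);
    b->slots = malloc((size_t)capacity * sizeof *b->slots);
    if (b->slots == NULL)
        return -1;
    b->capacity = capacity;
    b->count = b->head = b->tail = 0;
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->not_full, NULL);
    pthread_cond_init(&b->not_empty, NULL);
    return 0;
}

/* Called by worker threads: blocks while the buffer is full. */
void bb_put(bbuf_t *b, item_t *it) {
    pthread_mutex_lock(&b->lock);
    while (b->count == b->capacity)
        pthread_cond_wait(&b->not_full, &b->lock);
    b->slots[b->tail] = it;
    b->tail = (b->tail + 1) % b->capacity;
    b->count++;
    pthread_cond_signal(&b->not_empty);
    pthread_mutex_unlock(&b->lock);
}

/* Called by the printer thread: blocks while the buffer is empty. */
item_t *bb_get(bbuf_t *b) {
    pthread_mutex_lock(&b->lock);
    while (b->count == 0)
        pthread_cond_wait(&b->not_empty, &b->lock);
    item_t *it = b->slots[b->head];
    b->head = (b->head + 1) % b->capacity;
    b->count--;
    pthread_cond_signal(&b->not_full);
    pthread_mutex_unlock(&b->lock);
    return it;
}

void bb_destroy(bbuf_t *b) {
    pthread_cond_destroy(&b->not_empty);
    pthread_cond_destroy(&b->not_full);
    pthread_mutex_destroy(&b->lock);
    free(b->slots);
}
```

The wait calls sit inside while loops so that a thread rechecks the condition after waking up. The sketch does not cover how the printer thread learns that all workers have finished; the assignment text leaves that design choice open.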
If the buffer is empty, the printer thread has to sleep until the buffer has at least one item. When the printer thread succeeds in retrieving an item from the buffer, it prints the item to the screen in the following format:

filename:linenumber:linestring

For example, if the keyword is ‘a’ and the content of the file to be searched (assume the file name is file1.txt) is as follows:

I raised my daughter in the American
fashion; I gave her freedom, but
taught her never to dishonor her
family. She found a boy friend,
not an Italian. She went to the
movies with him, stayed out late.
Two months ago he took her for a
drive, with another boy friend.
They made her drink whiskey and
then they tried to take advantage
of her. She resisted; she kept her
honor. So they beat her like an
animal. When I went to the hospital
her nose was broken, her jaw was
shattered and held together by
wire, and she could not even weep
because of the pain.

then the printer thread will print the following output to the screen:

file1.txt:4:family. She found a boy friend,
file1.txt:7:Two months ago he took her for a

Pay attention to the following issues:

• The output should not contain the name of the directory, but just the name of the file.
• Only the lines that include an exact match of an entire word are printed.
• For each search command read from the commandFile, a specific bounded buffer, a specific printer thread, and a bunch of worker threads (one for each file in the given dirPath) are created by the child process that is assigned to that search command.
• All the worker threads for a given command do the same thing on different files and fill the same bounded buffer.
• Your program passes each received search command to a child process and handles the next search command immediately, without waiting for the created child to complete its execution. However, you should also make sure that your program does not terminate before all the output is printed.
To achieve this, you can count the number of children created by the parent and, after handling all the requests (end of the command file is reached), have the parent process call the wait() system call in a loop count times.

You can assume that MAXLINESIZE is 1024. Also, remember to use thread-safe library functions inside your threads (for instance, strtok_r() instead of strtok()).

Compiler: gcc
Submit file: ks_bb.c
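The parent's fork-then-wait pattern could look roughly like the sketch below. It assumes each command line has the form "dirPath keyword" with no spaces inside the path, as in the sample commandFile. The names dispatch_all and serve_command are hypothetical, and serve_command is only a placeholder for the work the child would really do.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXLINESIZE 1024

/* Hypothetical placeholder: the real child would create the bounded buffer,
 * the worker threads, and the printer thread here. */
static int serve_command(const char *dir_path, const char *keyword) {
    (void)dir_path;
    (void)keyword;
    return 0;
}

/* Reads "dirPath keyword" lines from fp, forks one child per command without
 * waiting in between, then reaps every child that was created.
 * Returns the number of children that exited with status 0. */
int dispatch_all(FILE *fp) {
    char line[MAXLINESIZE + 2];          /* room for '\n' and '\0' */
    char dir_path[MAXLINESIZE + 1];
    char keyword[MAXLINESIZE + 1];
    int created = 0;
    int succeeded = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%1024s %1024s", dir_path, keyword) != 2)
            continue;                    /* skip blank or malformed lines */

        fflush(stdout);                  /* avoid duplicated buffered output */
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            continue;
        }
        if (pid == 0)                    /* child: serve one command, then exit */
            _exit(serve_command(dir_path, keyword) == 0 ? 0 : 1);
        created++;                       /* parent: move on to the next command */
    }

    /* Call wait() once per created child so the parent outlives all output. */
    for (int i = 0; i < created; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

The child leaves through _exit() so that it never falls back into the parent's read loop, and the parent only starts waiting once the whole command file has been read.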

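The whole-word match that a worker thread performs on each line could be sketched with strtok_r() as follows. The function name line_has_word is hypothetical, and the delimiter set (space, tab, newline) is an assumption about which characters count as whitespace.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

#define MAXLINESIZE 1024
#define DELIMS " \t\n"   /* assumed whitespace set: space, tab, newline */

/* Returns 1 if some whitespace-delimited word of line equals keyword,
 * and 0 otherwise. Works on a private copy because strtok_r() writes
 * '\0' bytes into the string it scans. */
int line_has_word(const char *line, const char *keyword) {
    char copy[MAXLINESIZE + 1];
    char *save = NULL;   /* per-call state, which is what makes this thread-safe */

    assert(line != NULL && keyword != NULL);
    strncpy(copy, line, MAXLINESIZE);
    copy[MAXLINESIZE] = '\0';

    for (char *tok = strtok_r(copy, DELIMS, &save);
         tok != NULL;
         tok = strtok_r(NULL, DELIMS, &save)) {
        if (strcmp(tok, keyword) == 0)
            return 1;    /* one item per line, so stop at the first match */
    }
    return 0;
}
```

Under this tokenization, punctuation stays attached to its word, so ‘friend,’ would not match the keyword ‘friend’. Whether that is the intended behaviour depends on how the assignment's whitespace rule is read.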