CS4770 Pattern Recognition
Assignment 3
Due: Oct 24, 2006

The purpose of this assignment is to get some hands-on experience with Neural Networks. There is plenty of neural network code available out there. Your TA will give you one that works. Use it to study the problems below. Keep in mind the practical tips about Neural Networks before applying them to these problems.

1. 8-bit complement problem.

Input: An 8-bit binary string as an 8-vector.
Output: An 8-bit binary string that is the bit-wise complement of the input string.
Training set: 204 (of the max 256 combinations) strings. Leave out every 5th number when counting from 0 to 255. (The left-out numbers could be 0, 5, 10, ... or 1, 6, 11, ... or 2, 7, ...: it is your choice. Starting at 0 leaves out 52 numbers and gives exactly 204 training strings; any other start leaves out 51 and gives 205.)
Validation set: 20 numbers from the left-out numbers, chosen randomly.
Testing set: The remaining left-out numbers.

You should train the network till the validation error starts to increase or for 5000 epochs, whichever is earlier.
(a) Use a learning rate of 0.1 for the above.
(b) Start with a learning rate of 0.4. After 50 epochs, set the learning rate to 0.98 times the current value.

Give an HTML report that explains the encoding, initialization, activation function, etc. Plot the validation error and testing error against the iteration number. Since each output bit computes the complement of the corresponding input bit, are the weights similar for each output/intermediate unit? Analyze the performance and report anything interesting that you observe. Discuss the convergence properties under (a) and (b) above. Credit will be given for further analysis of the problem, its variations, number of hidden nodes, starting points, etc. Studying more will give you more insight into the Neural Network!

2. Take some really frivolous data like the following and try to train a Neural Network on it.

*** Note: The TA has collected some data that is for a slightly different problem.
If you are already on your way to solving the original problem (given below) with your own data, go ahead. Otherwise, you can use the data given and do the new problem.

Original problem: Collect the following data about 100+ one-day matches. The input is the number of runs scored and the number of wickets that fell in the first 5 overs, encoded as a 29-bit string: the first 5x5 bits give the runs, with 5 bits per number in binary format, and the last 4 bits give the number of wickets that fell as a binary number. The output is a 7-bit number indicating the range of runs scored by the team: 0-99, 100-149, 150-199, 200-249, 250-299, 300-349, 350-449. Use the data for both teams in every match.

New problem: The data put up by the TA contains the following information about a number of one-day matches. It gives the runs at which each wicket fell in both innings of each match and the information about which team won the match. He will announce the format of the data to you. You have to train a neural network with the last digits of the scores at which the first 5 wickets fell. The input thus is a 40-bit string giving 10 numbers (5 for each innings) in the range 0-9, each encoded using 4 bits. If fewer than 5 wickets fell, use 15 (1111) as the 4-bit code for the wickets that did not fall. The output is a single bit: 1 if the first team won the match and 0 if the second team won the match.

Both: Train a 3-layer network with 10 hidden nodes. After training it using lots of matches, let us predict the results for the ongoing Challenger Cup! Let us see the accuracy! Discuss the results in a nice and exhaustive HTML report! As usual, you are invited to try out as many variations as possible to learn a bit about this tough problem.
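
For Problem 1, the training/validation/testing split can be sketched as below. This is only an illustration of the rule "leave out every 5th number", not the TA's code. The function names (to_bits, complement, make_split, as_pairs), the most-significant-bit-first ordering, and the fixed random seed are all assumptions made for the example.

```python
import random

def to_bits(n, width=8):
    """Return n as a list of `width` bits, most significant bit first (assumed order)."""
    return [(n >> i) & 1 for i in range(width - 1, -1, -1)]

def complement(bits):
    """Bit-wise complement of a list of 0/1 values."""
    return [1 - b for b in bits]

def make_split(offset=0, n_val=20, seed=0):
    """Split 0..255 into training, validation and testing numbers.

    offset: first left-out number (0..4); every 5th number from there is left out.
    n_val:  how many left-out numbers go to the validation set.
    seed:   hypothetical fixed seed so the random choice is repeatable.
    """
    left_out = list(range(offset, 256, 5))
    left_set = set(left_out)
    train = [n for n in range(256) if n not in left_set]
    rng = random.Random(seed)
    val = sorted(rng.sample(left_out, n_val))
    val_set = set(val)
    test = [n for n in left_out if n not in val_set]
    return train, val, test

def as_pairs(numbers):
    """Turn numbers into (input vector, target vector) pairs."""
    return [(to_bits(n), complement(to_bits(n))) for n in numbers]

train, val, test = make_split(offset=0)
print(len(train), len(val), len(test))  # 204 20 32
```

With offset=0 the left-out numbers are 0, 5, ..., 255, which gives the 204 training strings mentioned in the problem. The pairs from as_pairs(train) can then be fed to whatever network code the TA provides.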
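
For the new problem in Problem 2, the 40-bit input encoding could look roughly like the sketch below. The actual data format will be announced by the TA, so the input here (a plain list of scores at which wickets fell, per innings) is an assumption, as are the names nibble, encode_innings and encode_match and the most-significant-bit-first ordering. The scores in the usage line are made up.

```python
PAD = 15  # 4-bit code 1111, used for a wicket that did not fall

def nibble(value):
    """Return a value in 0..15 as 4 bits, most significant bit first (assumed order)."""
    return [(value >> i) & 1 for i in range(3, -1, -1)]

def encode_innings(fall_scores, n_wickets=5):
    """Encode the last digits of the scores at which the first 5 wickets fell.

    fall_scores: team scores at the fall of each wicket, in order
                 (hypothetical input format).
    Returns 5 * 4 = 20 bits; missing wickets are padded with 15.
    """
    digits = [score % 10 for score in fall_scores[:n_wickets]]
    digits += [PAD] * (n_wickets - len(digits))
    bits = []
    for d in digits:
        bits.extend(nibble(d))
    return bits

def encode_match(first_innings, second_innings):
    """Concatenate both innings into the 40-bit input vector."""
    return encode_innings(first_innings) + encode_innings(second_innings)

# Made-up example: 5 wickets in the first innings, only 2 in the second.
x = encode_match([12, 45, 78, 101, 150], [3, 30])
print(len(x))  # 40
```

The target for each match is then the single bit described above: 1 if the first team won, 0 if the second team won.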