Description
Introduction
There is an interesting data set created by the U.S. Social Security Administration that stores the most
popular first names of babies born in the U.S. since the 1880’s. You can find information about it at
https://www.ssa.gov/OACT/babynames/index.html. We will be working with the most popular
names for male and female babies in each decade from 1880 to 2010. The information will be
contained in 14 files with the following names:
1880Names.txt 1890Names.txt 1900Names.txt 1910Names.txt 1920Names.txt
1930Names.txt 1940Names.txt 1950Names.txt 1960Names.txt 1970Names.txt
1980Names.txt 1990Names.txt 2000Names.txt 2010Names.txt
File Description
Each file contains 200 lines representing the top 200 names for male and female babies. The
format of each line is as follows:
1 John 89,950 Mary 91,668
• [1] The first number is the rank, i.e. 1 means that these are the most popular male and
female names in the decade from 1880 to 1889.
• [John 89,950] This is followed by the male baby name and then the number of babies
with that name born in the decade.
• [Mary 91,668] The fourth item is the female baby name of rank 1, followed by the
number of babies with that name.
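One way to split a line of this format into its five fields is sscanf(). The following is only a minimal sketch: the helper name parseLine and the 20-character buffers are assumptions made for illustration and are not part of babies.h. The two counts are kept as strings because they still contain commas at this point.

```c
#include <stdio.h>

/* Hypothetical helper (not part of babies.h): splits one line such as
 * "1 John 89,950 Mary 91,668" into its five fields.
 * The two counts are returned as strings because they still contain commas.
 * Each name and count buffer must hold at least 20 characters.
 * Returns 1 on success, 0 if the line does not contain five fields. */
int parseLine(const char *line, int *rank,
              char *maleName, char *maleCount,
              char *femaleName, char *femaleCount)
{
    /* %19s limits each string field to 19 characters plus the terminator */
    int fields = sscanf(line, "%d %19s %19s %19s %19s",
                        rank, maleName, maleCount, femaleName, femaleCount);
    return fields == 5;
}
```

Each line read from a file with fgets() could be passed to a helper like this, with the two count strings then cleaned up before conversion to integers.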
Program 1: babyQuery.c
Source Code Files
Your program will have the name babyQuery.c and you will also use the header file
babies.h. In babies.h, you will find the definitions that you will need for your program.
It has the following contents:
/* Defines */
#define MAXLENGTH 20
#define ROWS 200
/* Struct definitions */
struct pNames {
int year;
int rank[ROWS];
char maleName[ROWS][MAXLENGTH];
int maleNumber[ROWS];
char femaleName[ROWS][MAXLENGTH];
int femaleNumber[ROWS];
};
/* Function definitions */
int removeCommas ( char * );
You may add to this header file as needed but you cannot change what is already in the file.
Functionality
The program will accomplish the following tasks:
• Read in all the information about a decade that the user requests, e.g. if the user wants
information about the 1880’s then you must read in the file 1880Names.txt.
o As part of the input process you will have to eliminate the commas that appear
in the numbers in the input files, e.g. the string 89,950 has to be changed to
89950 before being sent to atoi(). This must be done in a function called
removeCommas() which will take one parameter, a pointer to a character
array. The function will return the number of commas removed from the string.
• Store this information in a structure that will be given to you in the header file
babies.h.
• You will then ask your user questions that will allow you to find the following types of
information:
o For a given rank, what is the (male, female, both) name, e.g. in the 1880’s, the
female name of rank 1 is Mary.
o The top 10 names (male and female) for the given decade.
o Given a name (female, male or both), find the rank for the given decade.
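The comma removal described above can be done in place with two pointers. The sketch below uses the signature from babies.h; the body is only one possible approach, not the required one.

```c
/* Removes every ',' from str in place and returns how many were removed.
 * The signature matches the prototype in babies.h; the body is one
 * possible implementation. */
int removeCommas(char *str)
{
    int removed = 0;
    char *dst = str;              /* next position to write to */

    for (char *src = str; *src != '\0'; src++) {
        if (*src == ',') {
            removed++;            /* skip the comma */
        } else {
            *dst++ = *src;        /* keep every other character */
        }
    }
    *dst = '\0';                  /* terminate the shortened string */
    return removed;
}
```

After this call, a string that held 89,950 holds 89950 and can be passed to atoi().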
Question Script
The questioning of the user must follow this script:
$ ./babyQuery
What decade do you want to look at? [1880 to 2010]: 1880
Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]: rank
Now there are three different paths for questioning:
Path 1: rank
What rank do you wish to see? [1-200]: 2
Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]: 2
Rank 2: Male: William (84881) and Female: Anna (38159)    (if response is 2)
Rank 2: Male: William (84881)    (if response is 0)
Rank 2: Female: Anna (38159)    (if response is 1)
Path 2: search
What name do you want to search for? [case sensitive]: Emily
Do you wish to search male (0), female (1), or both (2) name? [0-2]: 1
In 1880, the female name Emily is ranked 91 with a count of 3368.    (if response is 1)
In 1880, the male name Emily is not ranked.    (if response is 0 and the name is not found)
In 1880, the female name Emily is ranked 91 with a count of 3368 and the male name Emily is not ranked.    (if response is 2; the female name always goes first, even if it is not found)
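A case-sensitive search like the one in Path 2 can be written as a linear scan with strcmp(). This is a minimal sketch; the helper name findName is an assumption, and it works on one of the name arrays from struct pNames (maleName or femaleName).

```c
#include <string.h>

#ifndef MAXLENGTH
#define MAXLENGTH 20   /* same value as in babies.h */
#endif

/* Hypothetical helper: returns the 0-based index of name within the first
 * `rows` entries of names, or -1 if the name is not present.
 * strcmp() is case sensitive, so "emily" does not match "Emily". */
int findName(char names[][MAXLENGTH], int rows, const char *name)
{
    for (int i = 0; i < rows; i++) {
        if (strcmp(names[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

The returned index can then be used to look up the matching entries in the rank and number arrays of the same structure.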
Path 3: top
1 John 89950 Mary 91668
2 William 84881 Anna 38159
3 James 54056 Emma 25404
4 George 47651 Elizabeth 25006
5 Charles 46656 Margaret 21799
6 Frank 30967 Minnie 21724
7 Joseph 26292 Ida 18283
8 Henry 24139 Bertha 18263
9 Robert 24074 Clara 17717
10 Thomas 23750 Alice 17142
Notice that the columns are lined up. The number of spaces between each column is not less
than 3 and not more than 8. The exact number of spaces is not the point; what matters is that the
columns are aligned and look pleasing to the eye.
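Aligned columns are usually produced with field widths in the printf() family. The sketch below formats one row into a buffer; the helper name formatTopRow and the widths 5, 12 and 9 are arbitrary assumptions that you would tune to your data, since longer names need wider columns.

```c
#include <stdio.h>

/* Hypothetical helper: formats one row of the top-10 table into buf.
 * The '-' flag left-justifies each field inside its width, so every
 * column starts at the same position on every row.
 * Returns the number of characters snprintf() would have written. */
int formatTopRow(char *buf, size_t size, int rank,
                 const char *maleName, int maleNumber,
                 const char *femaleName, int femaleNumber)
{
    return snprintf(buf, size, "%-5d%-12s%-9d%-12s%d",
                    rank, maleName, maleNumber, femaleName, femaleNumber);
}
```

The same format string can be given directly to printf() inside a loop over the first 10 rows.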
After the answer has been presented to the user the following questions will be asked:
Do you want to ask another question about 1880? [Y or N]: Y
If the response is Y then return to the question “Would you like to see a rank, search for a
name, or see the top 10? [rank, search, top]: ”.
If the response is N then ask the following:
Would you like to select another year? [Y or N]: Y
If the response is Y then return to the question “What decade do you want to look at? [1880 to
2010]: ”.
If the response is N then terminate the program with the message:
Thank you for using babyQuery.
Error Checking
Error checking is extremely important when users are giving information to the program. For
all of the questions asked of the user, you must check that the input is exactly what was asked
for. Let us examine the various responses requested from the user:
• What decade do you want to look at? [1880 to 2010]:
o The response must be 1880, 1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960,
1970, 1980, 1990, 2000, or 2010. No other numbers are acceptable.
• Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]:
o The user must type in rank or search or top – all lower case and all spelled
correctly and in full.
• What rank do you wish to see? [1-200]:
o Only numbers between 1 and 200 are acceptable.
• Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• What name do you want to search for? [case sensitive]:
o The requested string is to be treated as case sensitive (names in the files have
the first letter in upper case and the rest in lower case). If they enter a name
that does not follow this format, the string is to be accepted as input but the
program is to do nothing to “fix” the case and thus the request will fail.
• Do you wish to search male (0), female (1), or both (2) name? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• Do you want to ask another question about 1880? [Y or N]:
o The user is to respond with a single letter, either Y or N but it is to be treated in a
case insensitive manner; i.e. y and n are acceptable.
• Would you like to select another year? [Y or N]:
o The user is to respond with a single letter, either Y or N but it is to be treated in a
case insensitive manner; i.e. y and n are acceptable.
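As an illustration of checking that input is exactly what was asked for, the decade question could be validated as sketched below. The helper name parseDecade is an assumption; it expects a line as read by fgets() and rejects anything other than a single acceptable decade.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: accepts a line such as "1880\n" and stores the
 * decade in *decade. Returns 1 if the line holds exactly one of
 * 1880, 1890, ..., 2010 and nothing else, otherwise 0. */
int parseDecade(const char *input, int *decade)
{
    char *end;
    long value = strtol(input, &end, 10);

    if (end == input) {
        return 0;                       /* no digits were found */
    }
    while (isspace((unsigned char)*end)) {
        end++;                          /* allow the trailing newline */
    }
    if (*end != '\0') {
        return 0;                       /* extra characters after the number */
    }
    if (value < 1880 || value > 2010 || value % 10 != 0) {
        return 0;                       /* not one of the 14 decades */
    }
    *decade = (int)value;
    return 1;
}
```

A return value of 0 would be the signal to print the error message and ask the question again.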
If the user makes an error, the program is to give an error message and then repeat the
question. The following are the error messages that are to be given:
• What decade do you want to look at? [1880 to 2010]:
o Acceptable decades are 1880, 1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960,
1970, 1980, 1990, 2000, or 2010. No other numbers are acceptable.
• Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]:
o Please type in rank, search, or top exactly as requested.
• What rank do you wish to see? [1-200]:
o Only numbers between 1 and 200 are acceptable.
• Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• What name do you want to search for? [case sensitive]:
o No error message is needed for this question.
• Do you wish to search male (0), female (1), or both (2) name? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• Do you want to ask another question about 1880? [Y or N]:
o Only the single characters Y or N are acceptable.
• Would you like to select another year? [Y or N]:
o Only the single characters Y or N are acceptable.
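The two Y or N questions can share one check. The following is a minimal sketch under the assumption that the response was read as a whole line with fgets(); the helper name parseYesNo is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 for Y or y, 0 for N or n, and -1 for
 * anything else, including empty input and responses longer than one
 * character such as "yes". */
int parseYesNo(const char *input)
{
    size_t len = strlen(input);

    if (len > 0 && input[len - 1] == '\n') {
        len--;                          /* ignore the newline kept by fgets() */
    }
    if (len != 1) {
        return -1;                      /* must be a single character */
    }

    int c = toupper((unsigned char)input[0]);
    if (c == 'Y') {
        return 1;
    }
    if (c == 'N') {
        return 0;
    }
    return -1;
}
```

A result of -1 would lead to the error message followed by the same question again.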
CIS1300 Assignment 3 (Part 2) – What’s in a Name?
Introduction
Welcome to Part 2 of Assignment 3. The challenge in this part of the assignment (worth 15% of
the total grade) is to use 2 data sets at the same time to do some comparisons.
Program 2: babiesQuery.c
Source Code Files
Your program will have the name babiesQuery.c and you will also use the header file
babies.h. In babies.h, you will find the definitions that you will need for your program.
It has the following contents:
/* Defines */
#define MAXLENGTH 20
#define ROWS 200
/* Struct definitions */
struct pNames {
int year;
int rank[ROWS];
char maleName[ROWS][MAXLENGTH];
int maleNumber[ROWS];
char femaleName[ROWS][MAXLENGTH];
int femaleNumber[ROWS];
};
/* Function definitions */
int removeCommas ( char * );
You may add to this header file as needed but you cannot change what is already in the file.
Functionality
The program will accomplish the following tasks:
• Read in all the information about multiple decades that the user requests, e.g. if the
user wants to compare the 1880’s to the 1980’s then you must read in the file
1880Names.txt and 1980Names.txt.
o You can also decide to load all of the Names files into your program before you
ask the user which decades they want.
• As before you will store this information in the structure given to you in the header file
babies.h.
o Once again you have a choice – you can create two structures, for example
struct pNames decade1;
struct pNames decade2;
o Or you can store the Names data in an array of type struct pNames
struct pNames decades[num] where num is from 2 to 14.
• You will then ask your user questions that will allow you to find the following types of
information:
o For a given rank, what is the (male, female, both) name in each decade, e.g. for the 1880’s
and the 1980’s, the female name of rank 1 is Mary in 1880 and Jessica in 1980.
o The names (male and female) that appear in the top 10 of both of the given decades.
o Given a name (female, male or both), find the rank for the given decades.
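If you choose the array option and load all 14 files, a decade can be mapped to an array index and to its file name with simple arithmetic. This is a sketch only; the helper names decadeIndex and buildFileName are assumptions and are not part of babies.h.

```c
#include <stdio.h>

/* Hypothetical helper: maps 1880 -> 0, 1890 -> 1, ..., 2010 -> 13, for use
 * as an index into an array such as struct pNames decades[14].
 * Returns -1 if year is not one of the 14 decades. */
int decadeIndex(int year)
{
    if (year < 1880 || year > 2010 || year % 10 != 0) {
        return -1;
    }
    return (year - 1880) / 10;
}

/* Hypothetical helper: writes the data file name for a decade, such as
 * "1980Names.txt", into buf. Returns 1 if the name fit, 0 otherwise. */
int buildFileName(char *buf, size_t size, int year)
{
    int written = snprintf(buf, size, "%dNames.txt", year);
    return written > 0 && (size_t)written < size;
}
```

With helpers like these, a loop from 1880 to 2010 in steps of 10 could open each file and fill the matching array element.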
Question Script
The questioning of the user must follow the following script:
$ ./babiesQuery
What 2 decades do you want to look at? [1880 to 2010]: 1880 1980
Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]: rank
Now there are three different paths for questioning:
Path 1: rank
What rank do you wish to see? [1-200]: 2
Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]: 2
Rank 2: 1880: Male: William (84881) and Female: Anna (38159)
1980: Male: Christopher (554984) and Female: Jennifer (440871)    (if response is 2)
Rank 2: 1880 Male: William (84881)
1980 Male: Christopher (554984)    (if response is 0)
Rank 2: 1880: Female: Anna (38159)
1980: Female: Jennifer (440871)    (if response is 1)
Path 2: search
What name do you want to search for? [case sensitive]: Emily
Do you wish to search male (0), female (1), or both (2) name? [0-2]: 1
In 1880, the female name Emily is ranked 91 with a count of 3368 and
In 1980, the female name Emily is ranked 25 with a count of 131755.    (if response is 1)
In 1880, the male name Emily is not ranked and
In 1980, the male name Emily is not ranked.    (if response is 0 and the name is not found)
In 1880, the female name Emily is ranked 91 with a count of 3368 and the male name Emily is not ranked
And in 1980, the female name Emily is ranked 25 with a count of 131755 and the male name Emily is not ranked.    (if response is 2; the female name always goes first, even if it is not found)
Path 3: top
Male names that appear in both 1880 and 1980 Top Tens: John, James, Joseph, Robert
Female names that appear in both 1880 and 1980 Top Tens: Elizabeth
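Finding the names shared by two Top Tens amounts to comparing every name in one list against every name in the other. The sketch below assumes the names are held in arrays shaped like maleName or femaleName in struct pNames; the helper name commonTopNames is hypothetical.

```c
#include <string.h>

#ifndef MAXLENGTH
#define MAXLENGTH 20   /* same value as in babies.h */
#endif

/* Hypothetical helper: copies into common[] every name that occurs in the
 * first `top` rows of both lists, in the order of the first list.
 * common must have room for `top` names. Returns the number copied. */
int commonTopNames(char first[][MAXLENGTH], char second[][MAXLENGTH],
                   int top, char common[][MAXLENGTH])
{
    int count = 0;

    for (int i = 0; i < top; i++) {
        for (int j = 0; j < top; j++) {
            if (strcmp(first[i], second[j]) == 0) {
                strcpy(common[count], first[i]);
                count++;
                break;              /* a name is listed only once */
            }
        }
    }
    return count;
}
```

Calling it once with the two maleName arrays and once with the two femaleName arrays, with top set to 10, gives the two lists to print.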
After the answer has been presented to the user the following questions will be asked:
Do you want to ask another question about 1880 and 1980? [Y or N]: Y
If the response is Y then return to the question “Would you like to see a rank, search for a
name, or see the top 10? [rank, search, top]: ”.
If the response is N then ask the following:
Would you like to select other decades? [Y or N]: Y
If the response is Y then return to the question “What 2 decades do you want to look at? [1880
to 2010]: ”.
If the response is N then terminate the program with the message:
Thank you for using babiesQuery.
Error Checking
Error checking is extremely important when users are giving information to the program. For
all of the questions asked of the user, you must check that the input is exactly what was asked
for. Let us examine the various responses requested from the user:
• What 2 decades do you want to look at? [1880 to 2010]:
o The response must be two of 1880, 1890, 1900, 1910, 1920, 1930, 1940, 1950,
1960, 1970, 1980, 1990, 2000, or 2010, separated by at least one space. No
other numbers are acceptable. Both numbers cannot be the same.
• Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]:
o The user must type in rank or search or top – all lower case and all spelled
correctly and in full.
• What rank do you wish to see? [1-200]:
o Only numbers between 1 and 200 are acceptable.
• Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• What name do you want to search for? [case sensitive]:
o The requested string is to be treated as case sensitive (names in the files have
the first letter in upper case and the rest in lower case). If they enter a name
that does not follow this format, the string is to be accepted as input but the
program is to do nothing to “fix” the case and thus the request will fail.
• Do you wish to search male (0), female (1), or both (2) name? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• Do you want to ask another question about 1880 and 1980? [Y or N]:
o The user is to respond with a single letter, either Y or N but it is to be treated in a
case insensitive manner; i.e. y and n are acceptable.
• Would you like to select other decades? [Y or N]:
o The user is to respond with a single letter, either Y or N but it is to be treated in a
case insensitive manner; i.e. y and n are acceptable.
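The two-decade response has more ways to go wrong than the other questions, so a sketch of one possible check follows. It assumes the response was read as a whole line with fgets(); the helper name parseTwoDecades is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: accepts a line such as "1880 1980\n".
 * Returns 1 and stores both decades only if the line holds exactly two
 * different acceptable decades separated by whitespace, otherwise 0. */
int parseTwoDecades(const char *input, int *first, int *second)
{
    int a, b;
    char extra;

    /* The trailing %c succeeds only if something other than whitespace
     * follows the second number, so a result of 2 means "exactly two". */
    if (sscanf(input, "%d %d %c", &a, &b, &extra) != 2) {
        return 0;
    }
    if (a < 1880 || a > 2010 || a % 10 != 0) {
        return 0;
    }
    if (b < 1880 || b > 2010 || b % 10 != 0) {
        return 0;
    }
    if (a == b) {
        return 0;                   /* both numbers cannot be the same */
    }
    *first = a;
    *second = b;
    return 1;
}
```

A return value of 0 would trigger the error message for this question and a repeat of the prompt.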
If the user makes an error, the program is to give an error message and then repeat the
question. The following are the error messages that are to be given:
• What 2 decades do you want to look at? [1880 to 2010]:
o Acceptable decades are 1880, 1890, 1900, 1910, 1920, 1930, 1940, 1950, 1960,
1970, 1980, 1990, 2000, or 2010. No other numbers are acceptable. You must
enter 2 acceptable decades separated by at least one space.
• Would you like to see a rank, search for a name, or see the top 10? [rank, search, top]:
o Please type in rank, search, or top exactly as requested.
• What rank do you wish to see? [1-200]:
o Only numbers between 1 and 200 are acceptable.
• Would you like to see the male (0), female (1), or both (2) name(s)? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• What name do you want to search for? [case sensitive]:
o No error message is needed for this question.
• Do you wish to search male (0), female (1), or both (2) name? [0-2]:
o Only the numbers 0, 1, or 2 are acceptable.
• Do you want to ask another question about 1880 and 1980? [Y or N]:
o Only the single characters Y or N are acceptable.
• Would you like to select other decades? [Y or N]:
o Only the single characters Y or N are acceptable.