CptS 215 PA1 Python Basics solution

Learner Objectives

At the conclusion of this programming assignment, participants should be able to:

  • Write Python code that utilizes:
    • File I/O
    • Strings
    • Lists
    • Command line arguments
  • Sort a list
  • Merge two lists together
  • Compute simple statistics

Prerequisites

Before starting this programming assignment, participants should be able to:

  • Run Python in interactive and script mode
  • Use Python variables, functions, conditionals, and loops
  • Perform arithmetic in Python

Acknowledgments

Content used in this assignment is based upon information in the following sources:

Overview and Requirements

Write a program (twitter_sort.py) that merges and sorts two Twitter feeds. At a high level, your program is going to perform the following:

  1. Read in two files containing Twitter feeds.
  2. Merge the Twitter feeds in reverse chronological order (most recent first).
  3. Write the merged feeds to an output file.
  4. Provide some basic summary information about the files.

The names of the files will be passed in to your program via command line arguments. Use the following input files to test your program: tweet1.txt and tweet2.txt
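As a minimal sketch of the command line handling, the following assumes the program receives three arguments (two input files and one output file), as in the Example Run later in this document. The helper name parse_args() is hypothetical and is not one of the required functions.

```python
import sys


def parse_args(argv):
    """Return (first_input, second_input, output) from an argv-style list.

    Assumes three arguments after the script name; parse_args() is a
    hypothetical helper, not part of the assignment's required functions.
    """
    if len(argv) != 4:
        raise ValueError("usage: python twitter_sort.py FEED1 FEED2 OUTPUT")
    return argv[1], argv[2], argv[3]


# An explicit list stands in for sys.argv so the sketch runs on its own.
example_argv = ["twitter_sort.py", "tweet1_demo.txt", "tweet2_demo.txt", "sorted_demo.txt"]
first_in, second_in, out_name = parse_args(example_argv)
print(first_in, second_in, out_name)
```

In the real program, main() would pass sys.argv instead of the example list.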

The output of your program includes the following:

  1. Console
    1. The name of the file that contained the most tweets followed by the number of tweets tweeted. In the event of a tie, print both filenames along with the number of tweets (Note: a file may be empty).
    2. The five earliest tweets along with the tweeter.
  2. sorted_tweets.txt: the lines from the input files sorted in reverse chronological order (most recent tweets first and earliest tweets at the end).
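The console output described above could be produced along these lines. This is only a sketch: the helper names are hypothetical, the wording of the tie message is an assumption (the assignment only says to print both filenames and the count), and the merged list is assumed to be ordered most recent first.

```python
def most_tweets_message(name1, count1, name2, count2):
    """Build the 'most tweets' line; the tie wording is an assumption."""
    if count1 > count2:
        return f"{name1} contained the most tweets with {count1}."
    if count2 > count1:
        return f"{name2} contained the most tweets with {count2}."
    # Tie (this also covers two empty files, where both counts are 0).
    return f"{name1} and {name2} both contained {count1} tweets."


def five_earliest(merged):
    """Return up to five earliest items from a most-recent-first list, earliest first."""
    return merged[-5:][::-1]


print(most_tweets_message("tweet1_demo.txt", 4, "tweet2_demo.txt", 2))
print(five_earliest(["f", "e", "d", "c", "b", "a"]))
```

Slicing with [-5:] also works when the list holds fewer than five items, so short or empty feeds need no special case.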

Program Details

File Format

Each input file will contain a list of records with one record appearing on each line of the file. The format of a record is as follows:

@TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC

Your job will be to read in each file and for each line in the file, create a record with the above information. In the above format, a tweet is a string that can contain a list of tokens. Also, HR:MN:SC should be treated as a single field of the record, the time.

Note: you should remove the “@” symbol from each tweeter’s name.
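To illustrate the record format, here is one possible way to split a single line into its fields. It uses a regular expression rather than the course's Scanner class, whose interface is not shown here; the function name parse_line() and the choice to store the tweet without its surrounding quotes are assumptions.

```python
import re

# One record per line: @TWEETER "TWEET" YEAR MONTH DAY HR:MN:SC
RECORD_PATTERN = re.compile(r'^@(\S+)\s+"(.*)"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s*$')


def parse_line(line):
    """Return [tweeter, tweet, year, month, day, time] for one input line.

    The "@" is dropped from the tweeter's name, and HR:MN:SC is kept
    as a single string field. parse_line() is a hypothetical helper.
    """
    match = RECORD_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized record: {line!r}")
    tweeter, tweet, year, month, day, time = match.groups()
    return [tweeter, tweet, int(year), int(month), int(day), time]


sample = '@enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00\n'
print(parse_line(sample))
```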

Reading from Files

You may use the provided Scanner class in the scanner.py module to help you parse different fields from the tweets.

Functions to Define

In addition to a main() function, define the following functions in your code:

  • read_records(): given a filename, creates a Scanner object, creates a record for each line in the file, and returns a list containing the records
  • create_record(): takes in a Scanner object, creates a record, and returns a list representing the record; note that the “@” symbol should be removed from the tweeter’s name
  • is_more_recent(): compares two records based on date and returns True if the first record is more recent than the second and False otherwise
  • merge_and_sort_tweets(): merges two lists of records, placing more recent records before earlier records, and returns the merged records as a single list
  • write_records(): takes in a list of records and writes each record to the output file on its own line
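The comparison and merge functions listed above might be sketched as follows. The record layout [tweeter, tweet, year, month, day, time] is an assumption, as is the hypothetical helper timestamp_key(). The merge uses a simple insertion approach that does not require the input lists to be sorted already; other strategies would work equally well.

```python
def timestamp_key(record):
    """Turn a record's date and HR:MN:SC fields into a comparable tuple (hypothetical helper)."""
    hour, minute, second = (int(part) for part in record[5].split(":"))
    return (record[2], record[3], record[4], hour, minute, second)


def is_more_recent(first, second):
    """Return True if the first record is more recent than the second."""
    return timestamp_key(first) > timestamp_key(second)


def merge_and_sort_tweets(records1, records2):
    """Merge two record lists into one list ordered most recent first."""
    merged = []
    for record in records1 + records2:
        # Skip past every record that is more recent than the new one.
        position = 0
        while position < len(merged) and is_more_recent(merged[position], record):
            position += 1
        merged.insert(position, record)
    return merged


feed1 = [["nohw4me", "i have no idea what my cs prof is saying", 2013, 10, 1, "12:07:14"],
         ["enigma", "im so clever, my code is even unreadable to me!", 2013, 10, 1, "09:27:00"]]
feed2 = [["ocd_programmer", "140 character limit? so i cant write my variable names", 2013, 10, 1, "13:18:01"],
         ["caffeine4life", "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt", 2011, 10, 2, "02:53:47"]]
for record in merge_and_sort_tweets(feed1, feed2):
    print(record[0])
```

Comparing tuples of integers avoids pitfalls of comparing the raw strings, since a month stored as "9" would otherwise sort after "10".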

Example Run

File 1 (tweet1_demo.txt):

@poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
@nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
@pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
@enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00

File 2 (tweet2_demo.txt):

@ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
@caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47

Run the program: python twitter_sort.py tweet1_demo.txt tweet2_demo.txt sorted_demo.txt

Example Console Output

Reading files...
tweet1_demo.txt contained the most tweets with 4.
Merging files...
Writing file...
File written. Displaying 5 earliest tweeters and tweets.
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt"
enigma "im so clever, my code is even unreadable to me!"
pythondiva "My memory is great <3 64GB android"
nohw4me "i have no idea what my cs prof is saying"
ocd_programmer "140 character limit? so i cant write my variable names"

Example Output File (sorted_demo.txt)

poptardsarefamous "Sometimes I wonder 2 == b or !(2 == b)" 2013 10 1 13:46:42
ocd_programmer "140 character limit? so i cant write my variable names" 2013 10 1 13:18:01
nohw4me "i have no idea what my cs prof is saying" 2013 10 1 12:07:14
pythondiva "My memory is great <3 64GB android" 2013 10 1 10:36:11
enigma "im so clever, my code is even unreadable to me!" 2013 10 1 09:27:00
caffeine4life "BBBBZZZZzzzzzZZZZZZZzzzZZzzZzzZzTTTTttt" 2011 10 2 02:53:47
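Lines in the format shown above could be produced with something like the following sketch. It assumes each record is a list [tweeter, tweet, year, month, day, time] with the tweet stored without its quotes. The filename parameter on write_records() and the helper format_record() are also assumptions, since the assignment only states that write_records() takes a list of records.

```python
def format_record(record):
    """Return one output line (without newline) for a record (hypothetical helper)."""
    tweeter, tweet, year, month, day, time = record
    return f'{tweeter} "{tweet}" {year} {month} {day} {time}'


def write_records(records, filename):
    """Write each record on its own line; the filename parameter is an assumption."""
    with open(filename, "w") as output:
        for record in records:
            output.write(format_record(record) + "\n")


print(format_record(["enigma", "im so clever, my code is even unreadable to me!", 2013, 10, 1, "09:27:00"]))
```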

Bonus (6 pts)

  • (3 pts) Use dictionaries to keep track of the number of times each hashtag appears in the two input files. Hashtags are common tokens in social media that start with a “#” and are followed by a string of words (such as “#thisisahashtag”). Print the most common hashtag.
  • (3 pts) Determine how many tweets go over the 140 character limit set by Twitter, and how many are “short” tweets under 50 characters long. Keep track of the character length of every tweet and, at the end, report the average character length of a tweet.
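Both bonus items could be approached roughly as below. This is a sketch under assumptions: tweets are plain strings without the surrounding quotes, a hashtag is any whitespace-separated token that starts with "#" and has at least one more character, and the function names and sample tweets are made up for illustration.

```python
def count_hashtags(tweets):
    """Return a dictionary mapping each hashtag to the number of times it appears."""
    counts = {}
    for tweet in tweets:
        for token in tweet.split():
            if token.startswith("#") and len(token) > 1:
                counts[token] = counts.get(token, 0) + 1
    return counts


def length_stats(tweets):
    """Return (count over 140 chars, count under 50 chars, average length)."""
    lengths = [len(tweet) for tweet in tweets]
    over_limit = sum(1 for length in lengths if length > 140)
    short = sum(1 for length in lengths if length < 50)
    average = sum(lengths) / len(lengths) if lengths else 0.0
    return over_limit, short, average


# Made-up sample tweets, not taken from the assignment's input files.
sample_tweets = ["late again #cs #coffee", "need more #coffee", "x" * 150]
hashtag_counts = count_hashtags(sample_tweets)
print(max(hashtag_counts, key=hashtag_counts.get))
print(length_stats(sample_tweets))
```

Note that max() is only safe here because the sample contains at least one hashtag; a real program would need to handle input with none.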

Submitting Assignments

  1. Use the Blackboard tool https://learn.wsu.edu to submit your assignment. You will submit your code to the corresponding programming assignment under the “Content” tab. You must upload your solutions as <your last name>_pa1.zip by the due date and time.
  2. Your .zip file should contain your .py files and all input .txt files used to test your program.

Grading Guidelines

This assignment is worth 100 points + 6 points bonus. Your assignment will be evaluated based on a successful compilation and adherence to the program requirements. We will grade according to the following criteria:

  • 15 pts for correct read_records()
  • 15 pts for correct create_record()
  • 15 pts for correct is_more_recent()
  • 15 pts for correct merge_and_sort_tweets()
  • 15 pts for correct write_records()
  • 5 pts for displaying the name of the file that contained the most tweets followed by the number of tweets tweeted
  • 5 pts for displaying the five earliest tweets along with the tweeter
  • 10 pts for a correct main() that drives your program
  • 5 pts for adherence to proper programming style and comments established for the class