BBM 103 Assignment 5 - Movie Reviews


Introduction
Movie reviews are a common tool that consumers use to decide whether a movie is worth the price and the time. There are different methods of creating reviews about movies. One of them is having different users rate the movies. GroupLens Research has collected and made available rating datasets from the MovieLens web site (http://movielens.org). We also use extra information about movies: Dennis Schwartz’s reviews.
In this assignment, you will implement a Python program that analyzes the GroupLens data and compares it with Dennis Schwartz’s reviews. The program will create HTML files for the movies that appear both in Dennis Schwartz’s reviews and in the GroupLens data, and it will try to guess the genres of movies based on the data obtained from the movies.
Fall 2016 BBM103: Introduction to Programming Laboratory 1 T.A. : Res. Assist. (Necva BOLUCU, Selma DILEK, Burcu YALCINER, Selim YILMAZ)
Stage 1: Create HTML Files for Movies
Step 1: Understand the GroupLens data
In this assignment, we will give you different files to analyze. The most important stage is understanding the data.
• u.item
Information about the items (movies).
The last 19 fields are the genres: a 1 indicates the movie is of that genre, a 0 indicates it is not. Movies can be in several genres at once.
The movie ids are the ones used in u.data.
Example: The content of the data
movie id | movie title | video release date | IMDb URL | unknown | Action | Adventure | Animation | Children’s | Comedy | Crime | Documentary | Drama | Fantasy | Movie-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western
1|Toy Story (1995)|01-Jan-1995|http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995|http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
3|Four Rooms (1995)|01-Jan-1995|http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
4|Get Shorty (1995)|01-Jan-1995|http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)|0|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|0|0|0
1176|Welcome To Sarajevo (1997)|01-Jan-1997|http://us.imdb.com/M/title-exact?Welcome+To+Sarajevo+(1997)|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|1|0
Analyzing a line:
Movie id : 1176
Movie title : Welcome To Sarajevo (1997)
Release date : 01-Jan-1997
IMDB Link : http://us.imdb.com/M/title-exact?Welcome+To+Sarajevo+(1997)
Genre : 0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|1|0
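The line analysis above can be sketched in Python as follows. This is only a minimal illustration, not the expected solution: the function name parse_item_line and the dictionary keys are hypothetical, and the sketch assumes every line has four leading fields followed by exactly 19 genre flags, as in the example lines.

```python
def parse_item_line(line):
    """Split one u.item line into its fields (layout assumed from the example)."""
    fields = [field.strip() for field in line.rstrip("\n").split("|")]
    movie_id, title, release_date, imdb_url = fields[:4]
    # The last 19 fields are the 0/1 genre flags.
    genre_flags = [int(flag) for flag in fields[-19:]]
    return {
        "id": movie_id,
        "title": title,
        "release_date": release_date,
        "imdb_url": imdb_url,
        "genre_flags": genre_flags,
    }


sample = ("1176|Welcome To Sarajevo (1997)|01-Jan-1997|"
          "http://us.imdb.com/M/title-exact?Welcome+To+Sarajevo+(1997)|"
          "0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|1|0")
movie = parse_item_line(sample)
print(movie["id"], movie["title"], movie["genre_flags"])
```

Stripping each field keeps stray spaces around the separators from leaking into the values.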
• u.genre
This file contains a list of the genres.
genre | genre id
You will use this file to format the genre field taken from u.item.
Example: Convert genre by taking genre names from the u.genre file
Movie id : 1176
Movie title : Welcome To Sarajevo (1997)
Genre : Drama War

• u.user
This file contains demographic information about the users. (The user ids are the ones used in the u.data data set.)
user id | age | gender | occupation id | zip code
• u.occupation
This file consists of a list of the occupations. (The occupation ids are the ones used in the u.user data set.)
occupation id | occupation
Analyzing a line of the u.user file by using the u.occupation file:
User id : 1
User Age : 24
Gender : M
Occupation : technician
Zip Code : 85711
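A minimal sketch of the two lookups described for u.genre and u.occupation. It rests on several assumptions that are not stated in the handout: the fields in all three files are separated by "|", genre ids start at 0 and match the position of the flag in u.item, and the occupation id 20 used for "technician" is invented for the demonstration. All function names are hypothetical.

```python
def load_genres(lines):
    """Build {genre id: genre name} from lines shaped 'genre|genre id'."""
    genres = {}
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        name, genre_id = line.split("|")
        genres[int(genre_id)] = name
    return genres


def genre_names(flags, genres):
    """Return the names of the genres whose flag is 1, in column order."""
    return [genres[index] for index, flag in enumerate(flags) if flag == 1]


def load_occupations(lines):
    """Build {occupation id: occupation} from 'occupation id|occupation' lines."""
    pairs = (line.strip().split("|") for line in lines if line.strip())
    return {occupation_id: name for occupation_id, name in pairs}


def parse_user_line(line, occupations):
    """Parse 'user id|age|gender|occupation id|zip code' and resolve the occupation."""
    user_id, age, gender, occupation_id, zip_code = line.strip().split("|")
    return {"id": user_id, "age": int(age), "gender": gender,
            "occupation": occupations[occupation_id], "zip": zip_code}


# Hypothetical excerpts; the real files list every genre and occupation.
genres = load_genres(["Drama|8", "War|17"])
flags = [0] * 19
flags[8] = flags[17] = 1  # the genre flags of movie 1176
print(" ".join(genre_names(flags, genres)))  # Drama War

occupations = load_occupations(["20|technician"])  # the id 20 is made up
user = parse_user_line("1|24|M|20|85711", occupations)
print(user["occupation"])  # technician
```

The zip code is kept as a string so that leading zeros and non-numeric codes survive.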
• u.data
This file comprises the full data set: 100000 ratings by 943 users on 1682 items.
Step 2: Understand Dennis Schwartz’s data
Dennis Schwartz’s review data is taken from https://www.cs.cornell.edu/people/pabo/moviereview-data/ (you can look here to get information about Dennis Schwartz). This data consists of different txt files.
Example: Content of a file in this folder (16748.txt)
Each file contains only one movie review. These files are supposed to be in a folder named film. You are expected to read these files one by one from the folder. The number of files can change, so you must read them in a loop.
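Reading an unknown number of review files in a loop could look like the sketch below. It is an illustration under assumptions: the function name read_reviews is hypothetical, the files are assumed to end in .txt, and undecodable bytes are replaced rather than rejected because the handout does not state the encoding. The demonstration runs on a throwaway folder with placeholder texts; in the assignment you would pass "film".

```python
import tempfile
from pathlib import Path


def read_reviews(folder):
    """Return {file name without extension: review text} for every .txt file."""
    reviews = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # The loop handles however many files the folder happens to contain.
        reviews[path.stem] = path.read_text(encoding="utf-8", errors="replace")
    return reviews


# Demonstration on a temporary folder with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "16748.txt").write_text("a placeholder review", encoding="utf-8")
    (Path(tmp) / "10.txt").write_text("another placeholder", encoding="utf-8")
    demo = read_reviews(tmp)
print(sorted(demo))
```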
Each line of u.data is a tab-separated list of: user id, movie id, rating, timestamp.
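As a minimal sketch, the tab-separated u.data lines can be grouped by movie so that the users who rated a movie are easy to find later. The function name load_ratings is hypothetical and the sample lines are made up for the demonstration.

```python
from collections import defaultdict


def load_ratings(lines):
    """Group ratings by movie id: {movie id: [(user id, rating), ...]}."""
    ratings = defaultdict(list)
    for line in lines:
        if not line.strip():  # skip blank lines
            continue
        user_id, movie_id, rating, _timestamp = line.rstrip("\n").split("\t")
        ratings[movie_id].append((user_id, int(rating)))
    return ratings


# Made-up sample lines in the u.data layout.
sample_lines = [
    "1\t1176\t4\t880000000\n",
    "7\t1176\t5\t880000100\n",
    "7\t3\t2\t880000200\n",
]
ratings = load_ratings(sample_lines)
print(ratings["1176"])
```

With the real file, the same function can be called as load_ratings(open("u.data")), since a file object yields its lines one by one.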
Example: film folder
Step 3: Combine the GroupLens data and Dennis Schwartz’s data
In order to create HTML files for movies, you must combine the datasets. You are expected to create HTML files for the movies which are in the film folder. In this step, you are expected to use list comprehensions.
First, compare the two datasets (the movies in the film folder and the movies in u.item) and select the movies that are in both. You will create a review.txt file in which to write messages for movies that are in u.item but not in the film folder, and for movies that are found in the folder. Use a user-defined exception to produce the messages.
Example: review.txt
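One possible shape for this comparison is sketched below, with loud assumptions: the movie titles found in the film folder are taken as already extracted into a set (the handout does not say how a review file maps to a movie), the message wording is a placeholder rather than the required format, and the names ReviewNotFoundError, check_review, compare and write_messages are all hypothetical. An in-memory buffer stands in for the review.txt file.

```python
import io


class ReviewNotFoundError(Exception):
    """Raised when a movie from u.item has no review in the film folder."""


def check_review(title, reviewed_titles):
    """Return a 'found' message or raise the user-defined exception."""
    if title not in reviewed_titles:
        raise ReviewNotFoundError(f"{title}: no review found in the film folder")
    return f"{title}: review found in the film folder"


def compare(item_titles, reviewed_titles):
    """Select the movies present in both datasets and collect one message per movie."""
    common = [title for title in item_titles if title in reviewed_titles]
    messages = []
    for title in item_titles:
        try:
            messages.append(check_review(title, reviewed_titles))
        except ReviewNotFoundError as error:
            messages.append(str(error))
    return common, messages


def write_messages(messages, out):
    """Write one message per line to an open text file."""
    for message in messages:
        out.write(message + "\n")


item_titles = ["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"]
reviewed_titles = {"GoldenEye (1995)"}  # hypothetical content of the film folder
common, messages = compare(item_titles, reviewed_titles)

buffer = io.StringIO()  # stand-in for open("review.txt", "w")
write_messages(messages, buffer)
print(buffer.getvalue(), end="")
```

Raising the exception inside a small helper keeps the message text in one place, and the list comprehension does the actual selection of common movies.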
After selecting the movies, you will find in u.data the ids of the users who rated them, and get detailed information about these users from u.user.
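A minimal sketch of this lookup, assuming the ratings have already been grouped by movie id and the users loaded into a dictionary keyed by user id. Both structures and the function name raters_of are hypothetical; user 1 is taken from the u.user example, and the rating values are made up.

```python
def raters_of(movie_id, ratings, users):
    """Return the user records, with the rating attached, of everyone who rated movie_id."""
    return [
        dict(users[user_id], rating=rating)
        for user_id, rating in ratings.get(movie_id, [])
        if user_id in users  # ignore ratings from users missing in u.user
    ]


ratings = {"1176": [("1", 4), ("2", 5)]}  # made-up ratings
users = {"1": {"age": 24, "gender": "M", "occupation": "technician", "zip": "85711"}}
print(raters_of("1176", ratings, users))
```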
Step 4: Write the review to an HTML file
Once you have extracted the information for the movies from the given data, you will use it to create HTML files, which are located in the filmList folder. The HTML file shows the necessary fields:
! The file name must be the film id given in u.item.
<html