Description

Part 1: Transformers
Task 1 (30 points): In this task, you will use the Facebook BART model
(https://huggingface.co/docs/transformers/en/model_doc/bart) to summarize
news articles. Text summarization in Natural Language Processing (NLP) is a technique that
condenses a long text into a few sentences or paragraphs while retaining the text’s meaning and
its most important information. Pick any one dataset of your choice.
You may have to perform data cleaning, preprocessing, etc. Next, perform the following tasks:
1. Describe the dataset you selected. Split your data into training and test sets with
a 90-10 split.
2. Load the model from Hugging Face’s Transformers library and write its training script.
3. Fine-tune the pre-trained model on your data and report results on your test set. You
must report the BLEU and ROUGE scores. (See the code provided in class for more
details.)
4. Analyze your results and discuss the impact of hyperparameters. Are your results
impacted by the choice of the LLM here? How?
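As an illustration of steps 1 and 3 only, the sketch below shows a seeded 90-10 split and hand-rolled versions of ROUGE-N and a simplified sentence-level BLEU. It is not the code provided in class. The function names, the whitespace tokenization, the single-reference setup, and the lack of smoothing are all assumptions made for brevity; a maintained metrics library would normally be used for reported results.

```python
import math
import random
from collections import Counter


def train_test_split_90_10(examples, seed=0):
    """Shuffle a copy of the examples and return (train, test) with a 90-10 split."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.9)
    return shuffled[:cut], shuffled[cut:]


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Return (precision, recall, f1) for ROUGE-N using whitespace tokens."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def bleu(reference, candidate, max_n=2):
    """Simplified sentence-level BLEU: one reference, no smoothing."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        ref = ngrams(ref_tokens, n)
        cand = ngrams(cand_tokens, n)
        overlap = sum((ref & cand).values())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any empty n-gram level gives 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand_tokens) > len(ref_tokens):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref_tokens) / len(cand_tokens))
    return penalty * math.exp(sum(log_precisions) / max_n)


train, test = train_test_split_90_10(range(100))
print(len(train), len(test))
print(rouge_n("the cat sat on the mat", "the cat lay on the mat"))
print(bleu("the cat sat on the mat", "the cat lay on the mat"))
```

ROUGE emphasizes recall (how much of the reference summary is covered), while BLEU emphasizes precision (how much of the generated summary is supported by the reference), which is one reason to report both.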
Part 2: Reinforcement Learning
Task 2 (20 points): We discussed how RL problems can be formulated as a Markov decision
process (MDP). Describe any real-world application that can be formulated as an MDP. Describe
the state space, action space, transition model, and rewards for that problem. You do not need
to be precise in the description of the transition model and reward (no formula is needed). A
qualitative description is enough.
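To make the four components concrete, here is a toy sketch of an MDP for a hypothetical thermostat controller. The states, actions, probabilities, and reward values are all invented for illustration and are not taken from the course material; the task itself only asks for a qualitative description.

```python
import random

# State space: coarse room-temperature bands (assumed for illustration).
STATES = ["cold", "comfortable", "hot"]
# Action space: what the controller can do at each time step.
ACTIONS = ["heat", "idle", "cool"]

# Transition model: TRANSITIONS[state][action] -> list of (next_state, probability).
TRANSITIONS = {
    "cold": {
        "heat": [("comfortable", 0.8), ("cold", 0.2)],
        "idle": [("cold", 0.9), ("comfortable", 0.1)],
        "cool": [("cold", 1.0)],
    },
    "comfortable": {
        "heat": [("hot", 0.6), ("comfortable", 0.4)],
        "idle": [("comfortable", 0.7), ("cold", 0.15), ("hot", 0.15)],
        "cool": [("cold", 0.6), ("comfortable", 0.4)],
    },
    "hot": {
        "heat": [("hot", 1.0)],
        "idle": [("hot", 0.9), ("comfortable", 0.1)],
        "cool": [("comfortable", 0.8), ("hot", 0.2)],
    },
}


def reward(state, action, next_state):
    """Reward comfort and charge a small energy cost for heating or cooling."""
    comfort = 1.0 if next_state == "comfortable" else -1.0
    energy_cost = 0.0 if action == "idle" else 0.2
    return comfort - energy_cost


def step(state, action, rng):
    """Sample a next state from the transition model and return it with its reward."""
    next_states, probabilities = zip(*TRANSITIONS[state][action])
    next_state = rng.choices(next_states, weights=probabilities)[0]
    return next_state, reward(state, action, next_state)


rng = random.Random(0)
state = "cold"
for _ in range(5):
    action = rng.choice(ACTIONS)  # a random policy, just to exercise the model
    state, r = step(state, action, rng)
    print(action, state, r)
```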
Task 3 (20 points): RL is used in various sectors; healthcare, recommender systems, and trading
are a few of them. Pick one of these three areas. Explain one problem in that
domain that can be solved more effectively by reinforcement learning. Find an open-source
project (if any) that has addressed this problem and explain the project in detail.
Task 4 is for 6000 level ONLY
Task 4 (100 points): Implement the game of tic-tac-toe (write a class that implements an agent
that plays tic-tac-toe and learns its Q function) using the Q-learning technique (see the
resources/links provided in class for more details). Clearly describe your evaluation metric and
demonstrate a few runs. You might need to use some online resources to proceed with this. Do
not forget to cite them.
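One possible shape for such an agent is sketched below. It is a minimal tabular Q-learning example under several assumptions that are not part of the assignment: the agent always plays X, the opponent plays uniformly random moves and is treated as part of the environment, rewards are +1 for a win, -1 for a loss, and 0 otherwise, and all names (QAgent, play_episode, and so on) are hypothetical.

```python
import random
from collections import defaultdict

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' for a completed line, 'draw' for a full board, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def place(board, move, mark):
    return board[:move] + (mark,) + board[move + 1:]


class QAgent:
    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (board, move) -> estimated return
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, board):
        """Epsilon-greedy action selection over the legal moves."""
        moves = legal_moves(board)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(moves)
        return max(moves, key=lambda m: self.q[(board, m)])

    def update(self, board, move, reward, next_board):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        if winner(next_board) is not None:
            best_next = 0.0  # terminal states have no future value
        else:
            best_next = max(self.q[(next_board, m)] for m in legal_moves(next_board))
        target = reward + self.gamma * best_next
        self.q[(board, move)] += self.alpha * (target - self.q[(board, move)])


def play_episode(agent, rng, learn=True):
    """Agent plays X against a uniformly random O; returns 'X', 'O', or 'draw'."""
    board = (" ",) * 9
    while True:
        move = agent.choose(board)
        next_board = place(board, move, "X")
        result = winner(next_board)
        if result is None:  # game continues, so the random opponent replies
            reply = rng.choice(legal_moves(next_board))
            next_board = place(next_board, reply, "O")
            result = winner(next_board)
        reward = {"X": 1.0, "O": -1.0}.get(result, 0.0)
        if learn:
            agent.update(board, move, reward, next_board)
        if result is not None:
            return result
        board = next_board


agent = QAgent()
rng = random.Random(1)
for _ in range(5000):
    play_episode(agent, rng)
agent.epsilon = 0.0  # act greedily during evaluation
results = [play_episode(agent, rng, learn=False) for _ in range(200)]
print({outcome: results.count(outcome) for outcome in ("X", "O", "draw")})
```

One candidate evaluation metric is the fraction of wins, draws, and losses over a batch of evaluation games played greedily with learning switched off, as in the last few lines; tracking it as training progresses shows whether the Q function is improving.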
Part 3: Recommender Systems
Task 5 (30 points): For this task use the MovieLens 100k dataset
(https://grouplens.org/datasets/movielens/100k/)
Perform the necessary data cleaning, EDA, and conversion to a user-item matrix.
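As a rough sketch of the conversion step, the example below builds a dense user-item matrix from tab-separated rows of user id, item id, rating, and timestamp, which is the layout of the u.data file in MovieLens 100k. The sample rows are made up, the function name is hypothetical, and using 0 to mark an unrated item is an assumption that works only because real ratings run from 1 to 5.

```python
# Made-up rows in the u.data layout: user id, item id, rating, timestamp.
SAMPLE_ROWS = """\
1\t10\t5\t881250949
1\t20\t3\t891717742
2\t10\t4\t878887116
3\t30\t1\t880606923
"""


def build_user_item_matrix(text):
    """Return (matrix, users, items); matrix[row][col] is a rating or 0 if unrated."""
    ratings = [
        tuple(int(field) for field in line.split("\t")[:3])
        for line in text.splitlines()
        if line.strip()
    ]
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    user_index = {user: row for row, user in enumerate(users)}
    item_index = {item: col for col, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in users]
    for user, item, rating in ratings:
        matrix[user_index[user]][item_index[item]] = rating
    return matrix, users, items


matrix, users, items = build_user_item_matrix(SAMPLE_ROWS)
for user, row in zip(users, matrix):
    print(user, row)

# Sparsity: the share of user-item pairs with no rating.
filled = sum(1 for row in matrix for value in row if value != 0)
print(1 - filled / (len(users) * len(items)))
```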
Implement any two collaborative filtering recommender system (RecSys) algorithms covered
in class (Matrix Factorization, Alternating Least Squares, NCF, etc.) and compare their
performance on any two evaluation metrics used for RecSys. You may read the literature to find
out which evaluation metrics are used for RecSys. Cite all your research.
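For orientation, here is a small sketch of one such algorithm, matrix factorization trained with stochastic gradient descent, together with two metrics that are commonly used for recommenders: RMSE for rating prediction and precision@k for top-k lists. The toy ratings, hyperparameter values, and function names are assumptions chosen for illustration, not a reference solution or the variant covered in class.

```python
import math
import random


def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[user], Q[item]))


def train_mf(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
             epochs=500, seed=0):
    """Fit user factors P and item factors Q by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for user, item, rating in ratings:
            err = rating - predict(P, Q, user, item)
            for f in range(rank):
                pu, qi = P[user][f], Q[item][f]  # keep old values for both updates
                P[user][f] += lr * (err * qi - reg * pu)
                Q[item][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


# Toy data with 0-based user and item indices (invented for this sketch).
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 2), (2, 2, 1)]
P, Q = train_mf(toy_ratings, n_users=3, n_items=3)
predictions = [predict(P, Q, u, i) for u, i, _ in toy_ratings]
print(rmse(predictions, [r for _, _, r in toy_ratings]))  # training RMSE
print(precision_at_k([10, 20, 30, 40], {20, 40, 50}, k=3))
```

The RMSE printed here is measured on the training ratings only; a fair comparison between two algorithms would compute both metrics on held-out ratings.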

Solved CSCI 4170 Homework 6

Related products

Solved CSCI 4170 Homework 5 (100 points) CNNs, AEs, GANs, Attention Mechanism

Solved CSCI 4170 Homework 4 (100 points) Sequence Models

Solved CSCI 4170 Homework 2 (100 points) Ensemble Learning