ECE1508: Deep Generative Models, Summer 2025, Assignment 1: Text Generation and Language Modeling


Question 1 [25 Points] (Basic Context-Aware LM) In this question, we implement a basic context-aware language model. For simplicity, we use character-level tokenization. Each token x_t is embedded by an embedding of size E. The context is then built by averaging all previous embeddings. Denoting the embedding at time i by e_i, the context at time t is computed as

    c_t = \frac{1}{t} \sum_{i=1}^{t} e_i .

The logits for the output probability distribution are then computed by a linear layer W \in \mathbb{R}^{I \times E}, where I denotes the vocabulary size. This means that the final layer computes

    z_t = W c_t ,

and passes it through softmax(·) to compute the distribution of the next token.
1. Complete Question 1 in Lastname_Firstname_Asgn1.ipynb to implement this language model.
2. Answer all parts marked by # COMPLETE .
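The context-averaging model above can be sketched in a few lines of PyTorch. This is an illustrative sketch, not the notebook's required solution; the class and argument names (AveragingLM, vocab_size, emb_dim) are hypothetical. The running mean c_t is computed for all positions at once via a cumulative sum:

```python
import torch
import torch.nn as nn

class AveragingLM(nn.Module):
    """Character-level LM where the context is the mean of all
    embeddings seen so far (hypothetical sketch, not the assignment's API)."""

    def __init__(self, vocab_size: int, emb_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)         # e_i
        self.out = nn.Linear(emb_dim, vocab_size, bias=False)  # W in R^{I x E}

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, T) integer token ids
        e = self.embed(tokens)                                 # (batch, T, E)
        # c_t = (1/t) * sum_{i=1}^{t} e_i, via a cumulative sum over time
        csum = torch.cumsum(e, dim=1)
        t = torch.arange(1, tokens.size(1) + 1, device=tokens.device)
        c = csum / t.view(1, -1, 1)                            # (batch, T, E)
        return self.out(c)                                     # logits z_t

model = AveragingLM(vocab_size=26, emb_dim=8)
logits = model(torch.randint(0, 26, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 26])
```

At training time, applying softmax to z_t (or, equivalently, using cross-entropy on the logits directly) gives the next-token distribution for every position in parallel.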
Question 2 [25 Points] (BERT Model) In this question, we work with the pretrained BERT model [1]. BERT is a pretrained transformer-based language model that uses bidirectional attention (i.e., without causal masking) to compute context. We can nevertheless use it for text generation via the special token [MASK], as illustrated in the assignment notebook. In this question, we first use the pretrained BERT to complete a text. We then selectively fine-tune BERT for text classification on a small dataset collected from the IMDB dataset. For fine-tuning, we use the low-rank adaptation (LoRA) method. Details of each part are given in the assignment notebook.
1. Complete Question 2 in Lastname_Firstname_Asgn1.ipynb to use BERT for text completion and classification.
2. Answer all parts marked by # COMPLETE .
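The core idea of LoRA is to freeze the pretrained weight W and learn only a low-rank update, so the effective layer computes W x + (alpha/r) B A x with A of shape (r, in) and B of shape (out, r). The minimal sketch below implements that idea directly in PyTorch; it is an assumption-laden illustration (class name LoRALinear and its arguments are hypothetical), not the peft library's API, which the notebook may use instead:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update
    (illustrative sketch of the LoRA idea, not the peft library's API)."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained weights
            p.requires_grad = False
        # A: small random init; B: zeros, so training starts from the base model
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base(x) + (alpha/r) * (B A) x
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

base = nn.Linear(16, 16)
lora = LoRALinear(base, r=4)
x = torch.randn(3, 16)
# B is zero-initialized, so the wrapped layer initially matches the base layer
assert torch.allclose(lora(x), base(x))
```

Because only A and B receive gradients, the number of trainable parameters per adapted layer drops from in*out to r*(in + out), which is what makes selectively fine-tuning a large model like BERT cheap.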
References
[1] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 4171–4186, 2019.
[2] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 9459–9474, 2020.