## Description

Introduction

In this programming assignment, you will practice implementing priority queues and disjoint sets and using them to solve algorithmic problems. In some cases you will just implement an algorithm from the lectures, while in others you will need to invent an algorithm to solve the given problem using either a priority queue or a disjoint set union. Recall that starting from this programming assignment, the grader will show you only the first few tests (see questions 5.4 and 5.5 in the FAQ section).

Learning Outcomes

Upon completing this programming assignment you will be able to:

1. Apply priority queues and disjoint sets to solve the given algorithmic problems.
2. Convert an array into a heap.
3. Simulate a program which processes a list of jobs in parallel.
4. Simulate a sequence of merge operations with tables in a database.

Passing Criteria: 2 out of 3

Passing this programming assignment requires passing at least 2 out of 3 code problems from this assignment. In turn, passing a code problem requires implementing a solution that passes all the tests for this problem in the grader and does so under the time and memory limits specified in the problem statement.

Contents

1 Problem: Convert array into heap

2 Problem: Parallel processing

3 Problem: Merging tables

4 General Instructions and Recommendations on Solving Algorithmic Problems
  4.1 Reading the Problem Statement
  4.2 Designing an Algorithm
  4.3 Implementing Your Algorithm
  4.4 Compiling Your Program
  4.5 Testing Your Program
  4.6 Submitting Your Program to the Grading System
  4.7 Debugging and Stress Testing Your Program

5 Frequently Asked Questions
  5.1 I submit the program, but nothing happens. Why?
  5.2 I submit the solution only for one problem, but all the problems in the assignment are graded. Why?
  5.3 What are the possible grading outcomes, and how to read them?
  5.4 How to understand why my program fails and to fix it?
  5.5 Why do you hide the test on which my program fails?
  5.6 My solution does not pass the tests? May I post it in the forum and ask for a help?
  5.7 My implementation always fails in the grader, though I already tested and stress tested it a lot. Would not it be better if you give me a solution to this problem or at least the test cases that you use? I will then be able to fix my code and will learn how to avoid making mistakes. Otherwise, I do not feel that I learn anything from solving this problem. I am just stuck.

1 Problem: Convert array into heap

Problem Introduction

In this problem you will convert an array of integers into a heap. This is the crucial step of the sorting algorithm called HeapSort. It has a guaranteed worst-case running time of $O(n \log n)$, as opposed to QuickSort’s average running time of $O(n \log n)$. QuickSort is usually used in practice, because typically it is faster, but HeapSort is used for external sort when you need to sort huge files that don’t fit into the memory of your computer.

Problem Description

Task. The first step of the HeapSort algorithm is to create a heap from the array you want to sort. By the way, did you know that algorithms based on heaps are widely used for external sort, when you need to sort huge files that don’t fit into the memory of a computer? Your task is to implement this first step and convert a given array of integers into a heap. You will do that by applying a certain number of swaps to the array. A swap is an operation which exchanges elements $a_i$ and $a_j$ of the array $a$ for some $i$ and $j$. You will need to convert the array into a heap using only $O(n)$ swaps, as was described in the lectures. Note that you will need to use a min-heap instead of a max-heap in this problem.

Input Format. The first line of the input contains a single integer $n$. The next line contains $n$ space-separated integers $a_i$.

Constraints. $1 \le n \le 100\,000$; $0 \le i, j \le n-1$; $0 \le a_0, a_1, \ldots, a_{n-1} \le 10^9$. All $a_i$ are distinct.

Output Format. The first line of the output should contain a single integer $m$, the total number of swaps. $m$ must satisfy $0 \le m \le 4n$. The next $m$ lines should contain the swap operations used to convert the array $a$ into a heap. Each swap is described by a pair of integers $i, j$, the 0-based indices of the elements to be swapped. After applying all the swaps in the specified order the array must become a heap, that is, for each $i$ where $0 \le i \le n-1$ the following conditions must be true:

1. If $2i + 1 \le n-1$, then $a_i < a_{2i+1}$.
2. If $2i + 2 \le n-1$, then $a_i < a_{2i+2}$.

Note that all the elements of the input array are distinct. Any sequence of swaps that has length at most $4n$ and after which your initial array becomes a correct heap will be graded as correct.

Time Limits. C: 1 sec, C++: 1 sec, Java: 3 sec, Python: 3 sec, C#: 1.5 sec, Haskell: 2 sec, JavaScript: 3 sec, Ruby: 3 sec, Scala: 3 sec.

Memory Limit. 512 MB.

Sample 1.

Input:

    5
    5 4 3 2 1

Output:

    3
    1 4
    0 1
    1 3

Explanation: After swapping elements 4 in position 1 and 1 in position 4 the array becomes 5 1 3 2 4. After swapping elements 5 in position 0 and 1 in position 1 the array becomes 1 5 3 2 4. After swapping elements 5 in position 1 and 2 in position 3 the array becomes 1 2 3 5 4, which is already a heap, because $a_0 = 1 < 2 = a_1$, $a_0 = 1 < 3 = a_2$, $a_1 = 2 < 5 = a_3$, $a_1 = 2 < 4 = a_4$.

Sample 2.

Input:

    5
    1 2 3 4 5

Output:

    0

Explanation: The input array is already a heap, because it is sorted in increasing order.
Starter Files

There are starter solutions only for C++, Java and Python3; if you use other languages, you need to implement the solution from scratch. Starter solutions read the array from the input, use a quadratic time algorithm that performs $\Theta(n^2)$ swaps to convert it to a heap, then write the output. You need to replace the $\Theta(n^2)$ implementation with an $O(n)$ implementation that uses no more than $4n$ swaps to convert the array into a heap.
What to Do

Change the BuildHeap algorithm from the lecture to account for min-heap instead of max-heap and for 0-based indexing.
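The adjustment described above can be sketched in Python as follows. This is a minimal illustration rather than the reference solution: the function name `build_heap` and the choice to return the swaps as a list of index pairs are assumptions made for this example. It sifts down every non-leaf node, from the last one to the root, using the 0-based child indices `2i + 1` and `2i + 2` and min-heap comparisons.

```python
def build_heap(data):
    """Turn `data` into a min-heap in place and return the swaps performed.

    `build_heap` is a hypothetical name chosen for this sketch.
    """
    swaps = []
    n = len(data)

    def sift_down(i):
        # Move data[i] down until both children are larger than it.
        while True:
            smallest = i
            left, right = 2 * i + 1, 2 * i + 2  # 0-based child indices
            if left < n and data[left] < data[smallest]:
                smallest = left
            if right < n and data[right] < data[smallest]:
                smallest = right
            if smallest == i:
                return
            swaps.append((i, smallest))
            data[i], data[smallest] = data[smallest], data[i]
            i = smallest

    # Leaves are already heaps, so start from the last node that has a child.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i)
    return swaps


if __name__ == "__main__":
    array = [5, 4, 3, 2, 1]
    result = build_heap(array)
    print(len(result))
    for i, j in result:
        print(i, j)
```

On the array from Sample 1 this sketch produces the same three swaps as the sample output, although any valid sequence of at most $4n$ swaps would be accepted.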
Need Help? Ask a question or see the questions asked by other learners at this forum thread.
2 Problem: Parallel processing

Problem Introduction

In this problem you will simulate a program that processes a list of jobs in parallel. Operating systems such as Linux, macOS or Windows all have special programs in them called schedulers which do exactly this with the programs on your computer.
Problem Description

Task. You have a program which is parallelized and uses $n$ independent threads to process the given list of $m$ jobs. Threads take jobs in the order they are given in the input. If there is a free thread, it immediately takes the next job from the list. If a thread has started processing a job, it doesn’t interrupt or stop until it finishes processing the job. If several threads try to take jobs from the list simultaneously, the thread with the smaller index takes the job. For each job you know exactly how long it will take any thread to process this job, and this time is the same for all the threads. You need to determine for each job which thread will process it and when it will start processing.

Input Format. The first line of the input contains integers $n$ and $m$. The second line contains $m$ integers $t_i$, the times in seconds it takes any thread to process the $i$-th job. The times are given in the same order as they are in the list from which threads take jobs. Threads are indexed starting from 0.

Constraints. $1 \le n \le 10^5$; $1 \le m \le 10^5$; $0 \le t_i \le 10^9$.

Output Format. Output exactly $m$ lines. The $i$-th line (0-based index is used) should contain two space-separated integers: the 0-based index of the thread which will process the $i$-th job and the time in seconds when it will start processing that job.

Time Limits. C: 1 sec, C++: 1 sec, Java: 4 sec, Python: 6 sec, C#: 1.5 sec, Haskell: 2 sec, JavaScript: 6 sec, Ruby: 6 sec, Scala: 6 sec.

Memory Limit. 512 MB.

Sample 1.

Input:

    2 5
    1 2 3 4 5

Output:

    0 0
    1 0
    0 1
    1 2
    0 4

Explanation:

1. The two threads try to simultaneously take jobs from the list, so the thread with index 0 actually takes the first job and starts working on it at the moment 0.
2. The thread with index 1 takes the second job and starts working on it also at the moment 0.
3. After 1 second, thread 0 is done with the first job and takes the third job from the list, and starts processing it immediately at time 1.
4. One second later, thread 1 is done with the second job and takes the fourth job from the list, and starts processing it immediately at time 2.
5. Finally, after 2 more seconds, thread 0 is done with the third job and takes the fifth job from the list, and starts processing it immediately at time 4.

Sample 2.

Input:

    4 20
    1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Output:

    0 0
    1 0
    2 0
    3 0
    0 1
    1 1
    2 1
    3 1
    0 2
    1 2
    2 2
    3 2
    0 3
    1 3
    2 3
    3 3
    0 4
    1 4
    2 4
    3 4
Explanation: Jobs are taken by 4 threads in packs of 4, processed in 1 second, and then the next pack comes. This happens 5 times starting at moments 0, 1, 2, 3 and 4. After that all the $5 \times 4 = 20$ jobs are processed.

Starter Files

The starter solutions for C++, Java and Python3 in this problem read the input, apply a $\Theta(n^2)$ algorithm to solve the problem and write the output. You need to replace the $\Theta(n^2)$ algorithm with a faster one. If you use other languages, you need to implement the solution from scratch.
What to Do

Think about the sequence of events when one of the threads becomes free (at the start and later after completing some job). How can you apply a priority queue to simulate the processing of these events in the required order? Remember to consider the case when several threads become free simultaneously. Beware of integer overflow in this problem: use type long long in C++ and type long in Java wherever the regular type int can overflow given the restrictions in the problem statement.
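One possible way to organize such a simulation is sketched below in Python, using the standard `heapq` module as the priority queue. The function name `assign_jobs` and the choice to key the queue on `(time the thread becomes free, thread index)` pairs are assumptions of this sketch, not part of the starter files. Because tuples compare element by element, ties in the free time are broken by the smaller thread index, which covers the case of several threads becoming free simultaneously. Python integers do not overflow, so the overflow warning applies only to the other languages.

```python
import heapq


def assign_jobs(n_threads, job_times):
    """Return a list of (thread index, start time) pairs, one per job.

    `assign_jobs` is a hypothetical name chosen for this sketch.
    """
    # Every thread is free at time 0; this sorted list is already a valid heap.
    free_at = [(0, thread) for thread in range(n_threads)]
    heapq.heapify(free_at)

    schedule = []
    for duration in job_times:
        # The earliest-free thread (smallest index on ties) takes the next job.
        start, thread = heapq.heappop(free_at)
        schedule.append((thread, start))
        heapq.heappush(free_at, (start + duration, thread))
    return schedule


if __name__ == "__main__":
    for thread, start in assign_jobs(2, [1, 2, 3, 4, 5]):
        print(thread, start)
```

Each job costs one pop and one push, so the whole simulation takes $O(m \log n)$ time plus $O(n)$ to set up the queue.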
Need Help? Ask a question or see the questions asked by other learners at this forum thread.
3 Problem: Merging tables

Problem Introduction

In this problem, your goal is to simulate a sequence of merge operations with tables in a database.
Problem Description

Task. There are $n$ tables stored in some database. The tables are numbered from 1 to $n$. All tables share the same set of columns. Each table contains either several rows with real data or a symbolic link to another table. Initially, all tables contain data, and the $i$-th table has $r_i$ rows. You need to perform $m$ of the following operations:

1. Consider table number $\mathit{destination}_i$. Traverse the path of symbolic links to get to the data. That is,

    while destination_i contains a symbolic link instead of real data do
        destination_i ← symlink(destination_i)

2. Consider the table number $\mathit{source}_i$ and traverse the path of symbolic links from it in the same manner as for $\mathit{destination}_i$.
3. Now, $\mathit{destination}_i$ and $\mathit{source}_i$ are the numbers of two tables with real data. If $\mathit{destination}_i \ne \mathit{source}_i$, copy all the rows from table $\mathit{source}_i$ to table $\mathit{destination}_i$, then clear the table $\mathit{source}_i$ and instead of real data put a symbolic link to $\mathit{destination}_i$ into it.
4. Print the maximum size among all $n$ tables (recall that size is the number of rows in the table). If the table contains only a symbolic link, its size is considered to be 0.

See examples and explanations for further clarifications.

Input Format. The first line of the input contains two integers $n$ and $m$, the number of tables in the database and the number of merge queries to perform, respectively. The second line of the input contains $n$ integers $r_i$, the number of rows in the $i$-th table. Then follow $m$ lines describing merge queries. Each of them contains two integers $\mathit{destination}_i$ and $\mathit{source}_i$, the numbers of the tables to merge.

Constraints. $1 \le n, m \le 100\,000$; $0 \le r_i \le 10\,000$; $1 \le \mathit{destination}_i, \mathit{source}_i \le n$.

Output Format. For each query print a line containing a single integer, the maximum of the sizes of all tables (in terms of the number of rows) after the corresponding operation.

Time Limits. C: 2 sec, C++: 2 sec, Java: 14 sec, Python: 6 sec, C#: 3 sec, Haskell: 4 sec, JavaScript: 6 sec, Ruby: 6 sec, Scala: 14 sec.

Memory Limit. 512 MB.
Sample 1.

Input:

    5 5
    1 1 1 1 1
    3 5
    2 4
    1 4
    5 4
    5 3

Output:

    2
    2
    3
    5
    5

Explanation: In this sample, all the tables initially have exactly 1 row of data. Consider the merging operations:

1. All the data from the table 5 is copied to table number 3. Table 5 now contains only a symbolic link to table 3, while table 3 has 2 rows. 2 becomes the new maximum size.
2. 2 and 4 are merged in the same way as 3 and 5.
3. We are trying to merge 1 and 4, but 4 has a symbolic link pointing to 2, so we actually copy all the data from the table number 2 to the table number 1, clear the table number 2 and put a symbolic link to the table number 1 in it. Table 1 now has 3 rows of data, and 3 becomes the new maximum size.
4. Traversing the path of symbolic links from 4 we have 4 → 2 → 1, and the path from 5 is 5 → 3. So we are actually merging tables 3 and 1. We copy all the rows from the table number 1 into the table number 3, and now the table number 3 has 5 rows of data, which is the new maximum.
5. All tables now directly or indirectly point to table 3, so all other merges won’t change anything.

Sample 2.

Input:

    6 4
    10 0 5 0 3 3
    6 6
    6 5
    5 4
    4 3

Output:

    10
    10
    10
    11

Explanation: In this example tables have different sizes. Let us consider the operations:

1. Merging the table number 6 with itself doesn’t change anything, and the maximum size is 10 (table number 1).
2. After merging the table number 5 into the table number 6, the table number 5 is cleared and has size 0, while the table number 6 has size 6. Still, the maximum size is 10.
3. By merging the table number 4 into the table number 5, we actually merge the table number 4 into the table number 6 (table 5 now contains just a symbolic link to table 6), so the table number 4 is cleared and has size 0, while the table number 6 has size 6. Still, the maximum size is 10.
4. By merging the table number 3 into the table number 4, we actually merge the table number 3 into the table number 6 (table 4 now contains just a symbolic link to table 6), so the table number 3 is cleared and has size 0, while the table number 6 has size 11, which is the new maximum size.
Starter Files

The starter solutions in C++, Java and Python3 read the description of tables and operations from the input, declare and partially implement disjoint set union, and write the output. You need to complete the implementation of disjoint set union for this problem. If you use other languages, you will have to implement the solution from scratch.
What to Do

Think how to use disjoint set union with path compression and union by rank heuristics to solve this problem. In particular, you should separate in your thinking the data structure that performs union/find operations from the merges of tables. If you’re asked to merge the first table into the second, but the rank of the second table is smaller than the rank of the first table, you can ignore the requested order while merging in the Disjoint Set Union data structure and instead join the node corresponding to the second table to the node corresponding to the first table. However, you will need to store the number of the actual second table, to which you were requested to merge the first table, in the parent node of the corresponding disjoint set, and you will need an additional field in the nodes of the Disjoint Set Union to store it.
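The separation described above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the starter code: the names `TableMerger`, `holder` and `simulate_merges` are hypothetical, tables are converted to 0-based indices internally, and the `holder` array plays the role of the additional field that records which actual table stores the data of each set.

```python
class TableMerger:
    """Disjoint set union with path compression and union by rank.

    All names here are hypothetical and chosen for this sketch.
    """

    def __init__(self, rows):
        n = len(rows)
        self.parent = list(range(n))
        self.rank = [0] * n
        self.rows = list(rows)         # total rows of a set, valid at its root
        self.holder = list(range(n))   # actual table holding the set's data
        self.max_rows = max(rows)

    def find(self, i):
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[i] != root:
            self.parent[i], i = root, self.parent[i]
        return root

    def merge(self, destination, source):
        rd, rs = self.find(destination), self.find(source)
        if rd != rs:
            target = self.holder[rd]   # table that must receive the rows
            total = self.rows[rd] + self.rows[rs]
            # Union by rank decides the tree shape, not where the data lives.
            root, child = (rd, rs) if self.rank[rd] >= self.rank[rs] else (rs, rd)
            if self.rank[root] == self.rank[child]:
                self.rank[root] += 1
            self.parent[child] = root
            self.rows[child] = 0
            self.rows[root] = total
            self.holder[root] = target
            self.max_rows = max(self.max_rows, total)
        return self.max_rows


def simulate_merges(rows, queries):
    """Apply 1-based (destination, source) queries; return the maxima."""
    merger = TableMerger(rows)
    return [merger.merge(d - 1, s - 1) for d, s in queries]


if __name__ == "__main__":
    queries = [(3, 5), (2, 4), (1, 4), (5, 4), (5, 3)]
    print(*simulate_merges([1, 1, 1, 1, 1], queries), sep="\n")
```

Since only the maximum size is printed, the `holder` field does not affect the output of this sketch; it is kept to show where the number of the actual destination table would be stored.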
Need Help? Ask a question or see the questions asked by other learners at this forum thread.
4 General Instructions and Recommendations on Solving Algorithmic Problems

Your main goal in an algorithmic problem is to implement a program that solves a given computational problem in just a few seconds even on massive datasets. Your program should read a dataset from the standard input and write an answer to the standard output. Below we provide general instructions and recommendations on solving such problems. Before reading them, go through readings and screencasts in the first module that show a step-by-step process of solving two algorithmic problems: link.
4.1 Reading the Problem Statement

You start by reading the problem statement that contains the description of a particular computational task as well as time and memory limits your solution should fit in, and one or two sample tests. In some problems your goal is just to implement carefully an algorithm covered in the lectures, while in some other problems you first need to come up with an algorithm yourself.
4.2 Designing an Algorithm

If your goal is to design an algorithm yourself, it is important to know what running time your algorithm is expected to achieve. Usually, you can guess it from the problem statement (specifically, from the subsection called constraints) as follows. Modern computers perform roughly $10^8$–$10^9$ operations per second. So, if the maximum size of a dataset in the problem description is $n = 10^5$, then an algorithm with quadratic running time will most probably not fit into the time limit (since for $n = 10^5$, $n^2 = 10^{10}$), while a solution with running time $O(n \log n)$ will fit. However, an $O(n^2)$ solution will fit if $n$ is up to $10^3 = 1000$, and if $n$ is at most 100, even $O(n^3)$ solutions will fit. In some cases, the problem is so hard that we do not know a polynomial solution. But for $n$ up to 18, a solution with $O(2^n n^2)$ running time will probably fit into the time limit. To design an algorithm with the expected running time, you will of course need to use the ideas covered in the lectures. Also, make sure to carefully go through sample tests in the problem description.
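The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a rough sketch: the constant `OPS_PER_SECOND` takes the lower end of the range quoted above as an assumption, the helper name `estimated_seconds` is hypothetical, and real running times depend on the constant factors hidden in the O-notation.

```python
import math

# Assumption for this sketch: the conservative end of the quoted range.
OPS_PER_SECOND = 10**8


def estimated_seconds(operation_count):
    """Rough running time for a given number of elementary operations."""
    return operation_count / OPS_PER_SECOND


if __name__ == "__main__":
    n = 10**5
    print("n^2       for n = 10^5:", estimated_seconds(n * n), "sec")
    print("n log2 n  for n = 10^5:", estimated_seconds(n * math.log2(n)), "sec")
    print("n^2       for n = 10^3:", estimated_seconds(1000**2), "sec")
    print("n^3       for n = 100: ", estimated_seconds(100**3), "sec")
    print("2^n n^2   for n = 18:  ", estimated_seconds(2**18 * 18**2), "sec")
```

Under this assumption the quadratic algorithm on $n = 10^5$ needs about 100 seconds, while every other case listed stays below one second.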
4.3 Implementing Your Algorithm

When you have an algorithm in mind, you start implementing it. Currently, you can use the following programming languages to implement a solution to a problem: C, C++, C#, Haskell, Java, JavaScript, Python2, Python3, Ruby, Scala. For all problems, we will be providing starter solutions for C++, Java, and Python3. If you are going to use one of these programming languages, use these starter files. For other programming languages, you need to implement a solution from scratch.
4.4 Compiling Your Program

For solving programming assignments, you can use any of the following programming languages: C, C++, C#, Haskell, Java, JavaScript, Python2, Python3, Ruby, and Scala. However, we will only be providing starter solution files for C++, Java, and Python3. The programming language of your submission is detected automatically, based on the extension of your submission.

We have reference solutions in C++, Java and Python3 which solve the problem correctly under the given restrictions, and in most cases spend at most 1/3 of the time limit and at most 1/2 of the memory limit. You can also use other languages, and we’ve estimated the time limit multipliers for them; however, we cannot guarantee that a correct solution for a particular problem running under the given time and memory constraints exists in any of those other languages.

Your solution will be compiled as follows. We recommend that when testing your solution locally, you use the same compiler flags for compiling. This will increase the chances that your program behaves in the same way on your machine and on the testing machine (note that a buggy program may behave differently when compiled by different compilers, or even by the same compiler with different flags).

∙ C (gcc 5.2.1). File extensions: .c. Flags: gcc -pipe -O2 -std=c11