CS2010 Lab 01 – About Me – Attendance



CS2010 Lab 01 – About Me – Attendance

0 0


CS2010TA

CS2010 TA Slides

On Github yyjhao / CS2010TA

CS2010 Lab 01

About Me

  • Yao Yujian
  • yyjhao@gmail.com
  • Please indicate [CS2010] in the title of your email

Attendance

Coursemology

  • http://coursemology.com/courses/20
  • Download the PS1 Files
  • Do not need to submit to Coursemology
  • Coursemology is for
    • Downloading PS packages
    • Viewing your scores and achievements
  • Submit to Mooshak

Mooshak

  • http://algorithmics.comp.nus.edu.sg/~mooshak
  • Password has been sent to you by email
  • Automatic and instant grading!
    • AC(cepted)
    • Wrong Answer
    • Time Limit Exceeded
    • Run Time Error
    • Compile Time Error
  • Unless there are specific constraints (like PS1 Subtask 3/ Problem C), if you get your code AC, you will get that amount of marks as stated in the problem description
  • However, post-deadline penalty (e.g. your code are found to be a very similar copy of someone else’s code) can still alter the score

The Problem Sets

  • Subtask 1 is always the easiest, but low marks
  • Subtask 2 (or also 3) is/are CS2010 standard, medium marks
  • The last (or the R) Subtask is quite challenging, but low marks
## PS1 - The Baby Names - Subtask 1/Problem A - How many names start with a certain letter? - Subtask 2+3/Problem B+C - How many names start with a certain prefix? - Problem B and C have the same test data - Problem B has 10s time limit but problem C only has 1s (STRICT) - For problem C, you also need to write your own **balanced** BST++! - Subtask R/Problem D/Very Hard/CS2010R: - How many names have a certain substring?
## Week 0 UVa problems 1. UVa 579 (to revise your Java skill) 2. UVa 10855 (to revise your knowledge about Array) 3. UVa 11988 (to revise your knowledge about Linked List) 4. UVa 11111 (to revise your knowledge about Stack) 5. UVa 10901 (to revise your knowledge about Queue) 6. UVa 10258 (to revise your knowledge about Sorting) 7. UVa 11340 (to revise your knowledge about Hashing)
## Focus on [UVa 10901](http://uva.onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=1842)
## That's all - More hints on solving PS1 will be given next week - But it is good if you try them first - Even better: Solve them without those extra hints - Remember: If you keep delaying your first attempt for PS1, you may run out of time even though you virtually have 2 weeks working time for PS1(23 Aug to 7 Sep)
# CS2010 Lab 02
## `TreeSet` and `TreeMap` * Both uses the same underlying data structure BBST * http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html * http://docs.oracle.com/javase/6/docs/api/java/util/TreeSet.html * Feel Free to Explore other methods

Case study: Indexing City Names

  • Given some cities
  • Connect some of them
TreeMap<String, Integer> nameToNum = new TreeMap<String, Integer>();
if(!nameToNum.containsKey(name)){
    nameToNum.put(name, nameToNum.size());
}
						
## Why check the API? * There is one method in TreeMap and TreeSet that can be very very useful for PS1 Subtask 2 * Method subMap(fromKey, toKey) in TreeMap * Method subSet(fromKey, toKey) in TreeSet * Which one to use for PS1 Subtask 2?
## Subtask 3 * Additional constraint * Need to emulate subSet(fromKey, toKey) of TreeSet *efficiently* * First, you BST has to be balanced * Second, your “subSet” method has to run in O(log n) * many students still have O(n) here * You have 1.5 more days before PS1 deadline \_(:3J∠)_
## Subtask R * If you are interested... * Not tested in CS2010 * Forget about BBST * String Processing, Bioinformatics * Read the book!
## PS2 Overview Priority Queue
## Q ? A : End();
# CS2010 E-Lab 03 ## Read this on your own
## About PS1 * Bugs in rotations, insertion, balancing * If your C code fail B, this is likely the cause * Remember to re-assign the return value of your rotation to the left/right * Because after rotation, the parent node now have a different child node * TLE for C * Likely because your algorithm runs in O(N) * Because it traverses to count all nodes in the range * Need to reduce the runtime to O(log(N)) * **However**! The test cases on Mooshak is weak... * Some manage to get AC even with O(N) algorithm...
## (Intended) Solution for Subtask C * You don't need to go through every node in the range! * Write a Balanced BST routine * e.g. AVL Tree (or something else if you are pro) * Make sure the rotateLeft/rotateRight operations and all the 4 cases during insertion are handled without bug (use PS1 Subtask 2 to check) * Notice that AVL Tree deletion is not needed * Try implementing it if you want to... * Augment this BST with “size” attribute * so that you can get a “rank” of a certain vertex in O(1) after you search it in O(log n) * See the next slide for (an almost complete) solution * Then, the answer is rank(upperbound)-rank(lowerbound) * **We split the boys and girls baby names into TWO bBSTs!**

PS2

PriorityQueue

// copy and paste this to your text editor for better reading experience
// credit: ch2_06_priority_queue.java from CP3 book
import java.util.*;

class pair < X, Y > { // utilizing Java "Generics"
  X _first;
  Y _second;

  public pair(X f, Y s) { _first = f; _second = s; }

  X first() { return _first; }
  Y second() { return _second; }

  void setFirst(X f) { _first = f; }
  void setSecond(Y s) { _second = s; }
}

class ch2_06_priority_queue {
  public static void main(String[] args) {
    // introducing 'pair'
    // take note of the use of comparator
    PriorityQueue < pair < Integer, String > > pq = new PriorityQueue < pair < Integer, String > >(1, 
      new Comparator< pair < Integer, String > >() { // overriding the compare method
        public int compare(pair < Integer, String > i, pair < Integer, String > j) {
          return j.first() - i.first(); // currently max heap, reverse these two to try produce min-heap
        }
      }
    );

    // suppose we enter these 7 money-name pairs below
    /*
    100 john
    10 billy
    20 andy
    100 steven
    70 felix
    2000 grace
    70 martin
    */
    pq.offer(new pair < Integer, String > (100, "john")); // inserting a pair in O(log n)
    pq.offer(new pair < Integer, String > (10, "billy"));
    pq.offer(new pair < Integer, String > (20, "andy"));
    pq.offer(new pair < Integer, String > (100, "steven"));
    pq.offer(new pair < Integer, String > (70, "felix"));
    pq.offer(new pair < Integer, String > (2000, "grace"));
    pq.offer(new pair < Integer, String > (70, "martin"));
    // this is how we use Java PriorityQueue
    // priority queue will arrange items in 'heap' based
    // on the first key in pair, which is money (integer), largest first
    // if first keys tied, use second key, which is name, largest first
  
    // the internal content of pq heap MAY be something like this:
    // re-read (max) heap concept if you do not understand this diagram
    // the primary keys are money (integer), secondary keys are names (string)!
    //                        (2000,grace)
    //           (100,steven)               (70,martin)   
    //     (100,john)   (10,billy)     (20,andy)  (70,felix)

    // let's print out the top 3 person with most money
    pair<Integer, String> result = pq.poll(); // O(1) to access the top / max element + O(log n) removal of the top and repair the structure
    System.out.println(result.second() + " has " + result.first() + " $");
    result = pq.poll();
    System.out.println(result.second() + " has " + result.first() + " $");
    result = pq.poll();
    System.out.println(result.second() + " has " + result.first() + " $");
  }
}

## E-Mission 1. `goto` http://rosemarietan.com/fyp/heap.html 2. You will see a default Max Heap of this 1-based compact array {n/a, 90, 19, 36, 17, 3, 25, 1, 2, 7} 3. Can you convert the same set of integers from the default Max Heap into a Min Heap? * Without changing anything else from this visualization! 4. One of you can Post the answer in [FB group](https://www.facebook.com/groups/241724769269875/), the rest can see the answer there
## Solution for Subtask 1 * Obviously PQ * You can use the `PriorityQueue` shown just now 1. Implement a `woman` object * **Important note**: Real life woman is *NOT* an object! 2. This `woman` class has to implement `interface Comparable` 3. Write a `compareTo` method * http://docs.oracle.com/javase/6/docs/api/java/lang/Comparable.html 4. Done.
## How about Subtasks 2 and 3? * Because `UpdateDilation()` * This requires ability to modify a key inside the Binary Heap where this key can be anywhere in the Binary Heap * not necessarily in the root – the easiest place * This operation is sometimes called as `heapUpdateKey()` * To do this efficiently, we need something else not yet taught in Lecture 04, which is left for students to explore on their own or discuss among each other * detailed solutions by next week, after PS2 is due \_(:3J∠)_
## Hacking Solution for Subtask 2 * We know UpdateDilation() can make things difficult * But in Subtask 2, N ≤ 10 * You can just use an array of size 10 and keep re-sorting the positions of up to 10 women for every `UpdateDilation()` * This way, if done correctly, can give you at least 75 marks * You have 1.5 more days to do this to salvage some marks
## PS3 Overview * PS3 is already out since Sunday 08 Sep 2013, 8pm * It is about Graph Data Structure and Graph Traversal * Given a graph * Find the lowest rating 'important' vertex * Important vertex - if removed, graph is disconnected * Rating is given directly * Expect O(V x V + V x E) * How to check if a graph is disconnected? * More hints next week ;P
# CS2010 Lab 04 ## And Online Quiz 1 (Trial) * Log in to your Computer * And launch Chrome or Firefox to prepare for the quiz * while I talk
## PS2 Debrief - Common Mistakes * Mostly OK * TLE in C: Likely using `remove()` method in `PriorityQueue` for GiveBirth or any other O(n) like searching the entire PQ for a woman of a certain name > Implementation note: this implementation provides > O(log(n)) time for the enqueing and dequeing methods > (offer, poll, remove() and add); **linear time** for the > remove(Object) and contains(Object) methods; and > constant time for the retrieval methods (peek, > element, and size). * Again, the test case is not very strong and some managed to pass...
## Expected solution(s) * Easiest: Use more than one bBSTs * One bBST to map woman name to dilation and arrival time * Another bBST to emulate the PQ, you can use Java TreeMap/TreeSet * Arrival is simple insertion * Query is FindMax * GiveBirth is search for that woman and remove it * IncreaseKey: search the woman, delete her old data, reinsert new data * Lazy update using PQ (advanced topic, see CP3) * Longest to code: Write your own Binary Heap class to do `UpdateKey`, use `HashMap` or `TreeMap` to map name to index
## About PS3 * How to identify an important room? * Have we learned an algorithm to check graph connectivity in Lecture 06? * What is the major difference between Subtask 2 and Subtask 3?
## PS3 Subtask 4 and PS Bonus * Mostly for CS2010R students * **You may lose your recess week if you attempt them :D**

The Online Quiz 1

  • Your chance for 2% BONUS POINTS!!!
  • URL:
  • http://algorithmics.comp.nus.edu.sg /weneedaname/theexperiment.html
  • This lab group 2-letter PWD prefix: wo
  • UID is your matric number all in UPPERCASE
  • Password is your Mooshak password with the prefix in front
# CS2010 Lab 05
## PS3 Debrief - Common Mistakes * Mostly OK * WA: Trapped in corner cases * TLE in C: need to convert to Adjacency List * TLE in D: Expected. D requires a totoally different algorithm
## About Subtask 4 * It is a classic graph problem * Working codes are searchable * Please **do not** blindly copy and paste them * Some student have admitted this and we rendered their submission invalid
## Solutions * Tarjan’s algorithm for finding articulation points(cut vertices) of an undirected graph in _O(V+E)_ * Hopcroft, J.; Tarjan, R. (1973). "Efficient algorithms for graph manipulation". _Communications of the ACM_ **16** (6): 372–378. * For 1-2-3: * Convert the graph from Adjacency Matrix to Adjacency List * For each vertex, try (virtually) blocking it, then runs O(V+E) DFS/BFS to see if the number of connected components increase (not necessarily to two, but can be more than two)
## PS Bonus * Subtask ABC are actually free marks :D * Model it as a graph problem * And solve it with something you have learned in Lecture 6: BFS
## Today’s Demo: Graph DS Conversion * By now, you have seen at least 3 graph data structures: 1. Adjacency Matrix (AM), e-Lecture 05 1. Adjacency List (AL), e-Lecture 05 1. Edge List (EL), Lecture 07 * Each DS is strong in certain areas but weak in another * There are 6 possibilities: AM to AL, AM to EL, AL to AM, AL to EL, EL to AM, and EL to AL
## Demo: AM to EL * At home, think of the other possibilities that we have not discussed yet
## PS4 Overview * PS4 is already out since Sunday 29 Sep 2013, 8pm * It is about Minimum Spanning Tree++ * The last two subtasks are where the fun lie >:DDDD
## Review of Prim’s and Kruskal’s Code
# CS2010 Lab 06
## About PS4 * Subtask 1: Tree * Subtask 2-4: No Tree * Subtask 5: Large number of query * Subtask 6: 1/5 of run time.
## Graph Modelling Exercise * Unlock the Lock (UVA 12160) * http://uva.onlinejudge.org/external/121/12160.html * Number Maze (UVA 929) * http://uva.onlinejudge.org/external/9/929.html

Unlock the lock

  • Given:
    • L, current lock code, 4 digits integer 0000-9999
    • U, unlock code, also 4 digits integer 0000-9999
    • R, number of buttons, R ≤ 10
      • Followed by Ri numbers: the value of each button
      • Each press of button i transform L to (L + Ri)
  • Problem: Can U be reached from L?
    • If yes, with how many button presses?
    • If no, print “Permanently Locked”

Finding the implicit graph

  • There are max 10000 vertices: 0000-9999!
  • Connect two vertices (L1 and L2) with an edge if L2 = (L1 + Ri) for some button i

  • What is the graph problem?
  • sssp from starting L, output dist[U]
  • if dist[u] is INF, print “Permanently Locked”

  • What is the graph algorithm?
  • O(V+E) BFS, as the graph is unweighted

Number Maze

  • Given:
    • N*M grid of 1-digit integers [0..9] and 1 ≤ N, M ≤ 999
  • Problem: What is the min cost value from TL to BR?
    • TL = top-right corner, BL = bottom-right corner

0

3

1

2

9

7

3

4

9

9

1

7

5

5

3

2

3

4

2

5

implicit graph

  • There are max 999 * 999 = 998001 1M vertices/cells
  • Connect two vertices/cells (a and b) with an edge
    • if they share a common border
    • Give all weight to a vertex by the value of cell in maze
  • sssp from top-left cell to bottom-right cell + maze[0][0]

  • O((V+E) * log V) Dijkstra’s for weighted graph
  • The O(VE) Bellman Ford’s is still too slow

0

3

1

2

9

7

3

4

9

9

1

7

5

5

3

2

3

4

2

5

## PS5 * It is about Single-Source Shortest Paths++ * The last subtask E has a nice twist that was originally meant for Quiz 2, but placed here as it may be ‘too hard’ for a 77 minutes Quiz 2
# CS2010 Lab 07
## PS4 Debrief * All of you cleared ABCD! * This is the "minimum standard" * Common problems: TLE for E and/or F
## Subtask E * Pre-compute all possible queries from source vertex 0-9 to all other vertices * by running `DFS` (or `BFS`) 10 times * after you obtained the `MST` * Store the query results in a 2D array of answers of size `10*V` * Using hash table or map (bBST) is possible but pointless * This is an algorithmic improvement: * Trading memory (2D array of answers) for `O(1)` query speed * This important key concept will be revisited in **Dynamic Programming**!
## Subtask F * Subtask 5 solution from previous slide * KEY STEP: Improve the slow `Scanner` and `println` * with `BufferedReader` + `PrintWriter` * This is constant factor implementation tweak * This has been used in PS1 skeleton code :D * How to use `BufferedReader` properly? * Standard: 1. call `readLine()` method 1. use `StringTokenizer` 1. then use `Integer.parseInt()` method * Better: * Use class `IntegerScanner` (this is part of **Ian Leow’s** code)
class IntegerScanner { // only work for all integer input
  BufferedInputStream bis;
  IntegerScanner(InputStream is) {
    bis = new BufferedInputStream(is, 1000000);
  }
  
  public int nextInt() {    
    int result = 0;
    try {
      int cur = bis.read(); // use read instead of readline
      if (cur == -1)
        return -1;
      while (cur < 48 || cur > 57)
        cur = bis.read();
      while (cur >= 48 && cur <= 57) { // ASCII values of 0 to 9
        result = result * 10 + (cur - 48);
        cur = bis.read();
      }
      return result;
    }
    catch(IOException ioe) {
      return -1;
} } }
						
## Subtask F * Speed up the output as we have **100K** queries * Standard: * Replace `System.out.println` * with `pr.println` in class `PrintWriter` * do not forget to call pr.close() at the end * Slightly better: * Combine StringBuilder with PrintWriter
## Subtask F * Minor enhancements (will not matter much): * Only run DFS/BFS from source s when asked… not always from [0..9] * On some small test cases V is < 10; * On some test cases, Steven only ask from source s = 0 * Kruskal’s user only * In `OutForAWalkVerifier.java`, there is a check that the edge weight of test data is `[0..1000]` * This can actually be exploited for Kruskal’s user * use Counting Sort to sort the edges * so that Kruskal’s runs in `O(E)` instead of `O(E log V)` * Learn Counting Sort on your own, it is in CS3230 syllabus, not here in CS2010
## Some more words * 88 students (a few TAs there) solved PS4 ABCDEF * 119 students (a few TAs there) solved PS4 ABCDE * This is much better than what Steven expected * So, praise your hard work and perseverance! * Good job! * And PS4F will likely be dropped next year \_(:3J∠)_ * Coming next: * PS6 is 'weak' * PS5 is already quite challenging * PS7R is also hard
## PS5 Subtask ABCD * Nothing to hide here, PS5 Subtask 1-4 are thereto force you to code at least one SSSP algorithm: * Subtask A: the graph is a tree * Subtask B: the graph is unweighted (constant weight) * Subtask C: the graph is weighted, VE < 1M * Subtask D: the graph is weighted, V+E < 250K * PS: You cannot assume that the value of V or E is small * There are test cases with V = 100K, E = 100K-1 * i.e. a very large tree with V+E ~= 200K
## PS5 Subtask E * The additional constraint is quite relevant in real life * Google around for recent (taxi) accidents videos in SG * Many happened around junctions * Note that some of those accidents are fatal * so please watch those videos with caution * What to do in order to handle this seemingly simple additional constraint? * (the shortest path cannot have more than 7 vertices on it) * Hint: Related to graph modeling exercise last week! * We will not disclose anything more detailed than that, but you can discuss among each other
## APIO 2013 Stuff * Cool stuffs about SSSP algorithms * Let’s see APIO13_finalversion.pdf, page 7-10 + 12-13 * As we have not learned Floyd Warshall’s yet, * let’s concentrate on **Subtask 5 and Subtask 6**

Modified Bellman Ford's

counter = 0
for each query p(s,t);
  dist[s] = 0; // s is the source vertex
  loop V-1 times
    change = false;
    for each edge (u,v) in L
      increase counter by 1;
      if dist[u] + weight(u,v) < dist[v]
        dist[v] = dist[u] + weight(u,v);
        change = true;
    if change is false // this is the ’optimized’ Bellman Ford
      break from the outermost loop;
  output dist[t];
  						

How to make it run in O(VE)?

## Solution 1. The Optimized Bellman Ford’s has a check inside the outer for-loop * if there is no more edge relaxation, it will immediately stop. 1. In order to force the Optimized Bellman Ford’s to really repeat all E edges relaxation V-1 times, we need to observe the listing of the E edges. 1. The code is written in such a way 1. All outgoing edges of vertex 0 is processed first 1. Then all outgoing edges of vertex 1, ..., until all outgoing edges of vertex V-1. 1. Therefore, we need the source vertex to be vertex V-1.
## Solution 1. We create a graph where we have edges from vertex i to vertex j if j < i. 1. The weight of edge (i, j) is 1 if i-j = 1 or i-j+1 otherwise. 1. This way, all E edges have to be relaxed exactly V−1 times. 1. The Modified Dijkstra’s has no problem with this input graph.

Modified Dijkstra’s

counter = 0;
for each query p(s,t)
  dist[s] = 0; pq.push(pair(0, s)); // pq is a priority queue
  while pq is not empty
    increase counter by 1;
    (d, u) = the top element of pq and then remove it from pq;
    if (d == dist[u]) // that important check
      for each edge (u,v) in L
        if (dist[u] + weight(u,v)) < dist[v]
          dist[v] = dist[u] + weight(u,v);
          insert pair (dist[v], v) into the pq; // lazy
  output dist[t];
  						

How to make this run extremely slowly?

## Solution 1. Modified Dijkstra re-enqueue and re-process new vertex information pair. 1. On a graph with no negative weight edge, such action is rare (but exists) 1. On a graph with negative weight but no negative weight cycle, one can construct a ‘dual path’ graph as shown in the next slide 1. To make Modified Dijkstra’s reprocess many vertices many times (exponentially) 1. whereas the Optimized Bellman Ford’s simply runs in O(VE).

Image courtesy of Francisco Criado, one of Steven's CP book reader from Spain

## PS6 Overview * PS6 is out today (Thursday 17 Oct 2013, 8am) * It is about a very simple DP * The proper lecture on this will only be delivered next week * But you have seen the solution actually (briefly during Lecture 08)… * The last subtask is still very simple (in Steven’s opinion) * It is slightly different from last year’s version 2.0 in order to minimize blind copy paste from last year’s solution * Let’s see PS6.pdf
# CS2010 Lab 08
## PS5 Debrief * Mostly AC in ABCD, no major surprise here * BUT, some who AC problem D forgot to do two things * Possible to make this TLE with a (much) stronger test data * WA (or TLE) in E for various reasons
## Solution * Copy & paste Dijkstra's Algorithm * SPFA Algorithm * Shortest Path Faster Algorithm

What if

PriorityQueue<IntegerPair> pq = new PriorityQueue<IntegerPair>();
pq.offer(new IntegerPair(0, s));
while (!pq.isEmpty()) {
  IntegerPair top = pq.poll();
  int d = top.first(), u = top.second();
  if (u == t) break; // to speed up a bit,
  // what if 0 is very close to 1 but the graph is very gigantic?
  if (d > dist.get(u)) continue; // some of you don’t do this
  // what if there are so many inferior information in the PQ?
  for (int j = 0; j < AdjList.get(u).size(); j++) {
    IntegerPair p = AdjList.get(u).get(j);
    int v = p.first(), weight_u_v = p.second();
    if (dist.get(u) + weight_u_v < dist.get(v)) {
      dist.set(v, dist.get(u) + weight_u_v);
      pq.offer(new IntegerPair(dist.get(v), v));
} } }

						
## PS4 Subtask 5: * Graph modeling * Map each vertex `v` to `(v, vertices_used_so_far)` * Keep track of how many vertices has been used in the shortest path * Then modify your Dijkstra’s implementation accordingly * Wrong implementation can causes various wrong answers * Fortunately Steven allow you to use pseudo-code in written tests
## Implementation Tips * Reorder (dist[v], v, depth[v]) to (dist[v], depth[v], v) * Is the first triple wrong? * Why does this help?
## PS5 DP Formuation * We can classify this problem as DP problem * This is because the transformed graph is actually a DAG * In fact, after you finish Lecture 11, you may want to re-do this problem with DP
## Demo Today * Discussion of one classic algorithm on DAG * UVa 926 - Walking Around Wisely * http://uva.onlinejudge.org/external/9/926.html * This is part of CS2010 final exam question in the past two years * (the final exam question is the simplified version)
## PS6 Ideas * Nothing much :O, classic Longest Path on DAG * Topological Sort + Linear Pass as in Lecture 10 * Even the modified last subtask is also not that hard * You just need to find vertices not inside the critical path * (what is tested is the implementation to do this quickly) * To think about: What if Steven change +1 minute to +X minute(s) where X can be >1?
## PS7 * PS7 is out today (Thursday 24 Oct 2013, 00.00am) * This is about TSP++ * The proper lecture on this will only be delivered next week * The last subtask is quite challenging (in Steven’s opinion) * But maybe not as hard as PS Bonus (again this is subjective feeling)