Category: Tech

  • Georgia Tech OMSCS CS6515 (Graduate Algorithms) Course Review


    To pass this class, you should

    1. digest everything written in Joves’s notes (he’s a TA and will release these notes gradually throughout the semester so pay close attention to his Piazza posts)
    2. join or form a study group of a handful of students
    3. dedicate at least 20+ hours per week to drill, memorize, and apply algorithms
    4. complete all the homework assignments, (easy) project assignments, and quizzes (these are all easy points and you’ll need them given that exams make up 70% of your final grade)
    5. drill ALL the practice problems (both assigned and extra ones published on the wiki) over and over again until you’ve memorized them

    Almost failing this class

    This class kicked me in the ass. Straight up. Words can barely describe how relieved I feel right now; now that the summer term is over, my cortisol levels are finally returning to normal.

    I’m not exaggerating when I say I teared up when I learned that I received a passing grade. I barely — and I mean barely (by less than 1%) — passed this class with a B, a 71%. Throughout the last week of the summer semester, while waiting for final grades to be published on Canvas, I had fully prepared myself (both mentally and emotionally) for repeating this class, a level of anxiety and stress I hadn’t felt at any point in my previous three years in the OMSCS program.

    Other students in the class felt the same level of despair. One other student shared that he has:

    never felt that much pressure and depression from a class in [his] entire academic career.

    One other student definitely did not pull any punches on Piazza:

    I am going to open up a new thread after this course finishes out. I am tired of the arrogant culture that is in this program and specifically in this course! There is a lack of trying to understand other perspectives and that is critical for creating a thriving diverse intellectual community.

    So yes — this course is difficult.

    All that being said, take my review with a pinch of salt. Other reviewers have mentioned that you just need to “put in the work” and “practice all the assigned and wiki problems”. They’re right. You do need to do both those things.

    But the course may still stress you out: other courses in the program pretty much guarantee that you’ll pass (with an A or B) if you put in x number of hours; that doesn’t hold for GA. You can put in all the hours and still not pass this class.

    Before getting into the exam portion of my review, it’s worth noting that the systems classes I mentioned above play to my strengths as a software engineer building low-level systems; in contrast, Graduate Algorithms predominantly focuses on theory and is heavy on the math, a weakness of mine. Another factor is that I had never taken an algorithms course before, so many of the topics were brand spanking new to me. Finally, my mind wasn’t entirely focused on this class given that I had quit my job at FAANG during the first week of the class.

    Okay, enough context. Let’s talk about the exams.

    Exams

    As mentioned above, do ALL the practice problems (until you can solve them without thinking) and really make sure you understand everything in Joves’s notes. I cannot emphasize these two tips enough. You might be okay with just working the assigned practice problems, but I highly recommend that you also attempt the problems listed on the wiki, since exam questions seem to mirror (almost exactly) those questions. And again, Joves’s notes are essential since he structures the answers in the same way they are expected on the exam.

    Exam 1

    Exam 1 consists of 1) dynamic programming (DP) and 2) divide and conquer (DC).

    Read the dynamic programming (DP) section from the DPV textbook. Practice the dynamic programming problems over and over and over again.

    Attempt to answer all the dynamic programming problems, both the assigned practice problems and all the problems listed on the wiki. Some other reviewers suggest only practicing a subset of these problems, but just cover your bases and practice ALL of the practice problems — over and over again, until they become intuitive and until you can (with little to no effort) regurgitate the answers.
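    To give a flavor of the bottom-up tables these problems drill, here is a minimal sketch in Java. The problem choice (longest increasing subsequence) is my own illustration, not an actual exam question; the exam answers follow the same shape, though: define the subproblem, state the recurrence, fill the table.

```java
// Bottom-up dynamic programming sketch: longest increasing subsequence (LIS).
// Illustrative only -- not an actual exam question.
public class LisDp {
    // t[i] = length of the longest increasing subsequence ending at index i.
    public static int lis(int[] a) {
        if (a.length == 0) return 0;
        int[] t = new int[a.length];
        int best = 1;
        for (int i = 0; i < a.length; i++) {
            t[i] = 1; // base case: the subsequence containing only a[i]
            for (int j = 0; j < i; j++) {
                if (a[j] < a[i] && t[j] + 1 > t[i]) {
                    t[i] = t[j] + 1; // recurrence: extend the best subsequence ending at j
                }
            }
            best = Math.max(best, t[i]);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(lis(new int[]{5, 2, 8, 6, 3, 6, 9, 7})); // prints 4 (e.g. 2, 3, 6, 9)
    }
}
```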

    For the divide and conquer question, you MUST provide an optimal solution. If you provide a suboptimal one, you will be dinged heavily: my answer was correct but ran in O(n) instead of O(log n), and I lost half the points. A 50%. So make sure you understand recursion really well.
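    For intuition on why the O(log n) bound matters, here is the classic halving pattern, binary search (purely illustrative, not the exam question): discarding half the input on every step is what earns the logarithmic runtime.

```java
// Divide-and-conquer sketch: binary search over a sorted array in O(log n).
public class BinarySearch {
    // Returns the index of target in the sorted array a, or -1 if absent.
    public static int search(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids overflow for large lo + hi
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1; // discard the left half
            else hi = mid - 1;                 // discard the right half
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        System.out.println(search(new int[]{1, 3, 5, 7, 9}, 7)); // prints 3
    }
}
```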

    Exam 2

    Exam 2 focuses on graph theory. You’ll likely get a DFS/Dijkstra/BFS question and another question that requires you to understand spanning trees.

    The instructors want you to demonstrate that you can use the algorithms as black boxes (no need to prove their correctness so you can largely skip over the graph lectures). That is, you must understand when/why to use the algorithms, understand their inputs and outputs, and memorize their runtime complexity.

    For example, given a graph, you need to find out if a path exists from one vertex to another.

    To solve this problem, you should know the explore algorithm like the back of your hand. The algorithm takes a graph (directed or undirected) and a source vertex as inputs, and it returns a visited array, where visited[u] is set to True if a path exists from the source to u.
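    A minimal Java sketch of how I’d write explore, assuming an adjacency-list representation (the recursive DFS formulation from the DPV textbook):

```java
import java.util.List;

// The "explore" subroutine: given a graph as adjacency lists and a source
// vertex s, mark every vertex reachable from s.
public class Explore {
    public static boolean[] explore(List<List<Integer>> adj, int s) {
        boolean[] visited = new boolean[adj.size()];
        dfs(adj, s, visited);
        return visited; // visited[u] is true iff a path exists from s to u
    }

    private static void dfs(List<List<Integer>> adj, int u, boolean[] visited) {
        visited[u] = true;
        for (int v : adj.get(u)) {
            if (!visited[v]) dfs(adj, v, visited);
        }
    }

    public static void main(String[] args) {
        // directed graph: 0 -> 1 -> 2, plus an isolated vertex 3
        List<List<Integer>> adj = List.of(List.of(1), List.of(2), List.of(), List.of());
        boolean[] v = explore(adj, 0);
        System.out.println(v[2] + " " + v[3]); // prints "true false"
    }
}
```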

    That’s just one example. There are many other algorithms (e.g. DFS, BFS, Kruskal’s MST) you need to memorize. Again, see Joves’s notes (recommendation #1 at the top of this page). Seriously, Joves, if you are reading this, thanks again. Without your notes, I would 100% have failed the course.

    Exam 3

    Understand the difference between NP, NP-Hard, and NP-Complete.

    I cannot speak much to the multiple choice questions (MCQs) since I bombed that part of the exam. But I did relatively well on the single free-form question, again thanks to Joves’s notes. Make sure that you can 1) prove that the problem is in NP (i.e. a solution can be verified in polynomial time) and 2) reduce a known NP-Complete problem to the new problem (in that order; DO NOT do the reduction backwards and lose all the points).

    Summary

    Some students will cruise through this class. You’ll see them on Piazza and Slack, celebrating their near-perfect scores. Don’t let that discourage you. Most students find this topic extremely challenging.

    So just brace yourself: it is a difficult course. Put the work in. You’ll do fine. And I’ll be praying for you.

     

  • Graph theory and upcoming final exam

    I just scheduled the final exam for my discrete mathematics course, now that I’ve submitted my homework for the final assignment covering graph theory. Throughout this assignment, I was introduced to a variety of concepts: Leonhard Euler and his discovery of Euler paths and circuits, Hamiltonian paths and circuits, and a handful of graph-related terminology (e.g. vertices, edges, degrees, bridges, cut vertices, isomorphic graphs, trees, forests).

    In addition to the concepts above, I learned two different ways to represent graphs in (computer) memory. One approach is an adjacency matrix, a square matrix with one row and one column per vertex. The second way to represent a graph is an incidence matrix; unlike an adjacency matrix, its columns are vertices, with the rows representing edges.

    Star-like graph
    Adjacency and incidence matrices: two ways to represent the (above) graph
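    As a sketch, both matrices can be built in a few lines of Java. I’m assuming the star graph has one center vertex joined to three leaves; the incidence matrix below uses rows for edges and columns for vertices, matching the orientation described above (many texts use the transpose, so check your book’s convention):

```java
// Two in-memory graph representations, built for an undirected star graph
// with center vertex 0 and leaves 1, 2, 3.
public class GraphMatrices {
    // Adjacency matrix: one row and one column per vertex.
    public static int[][] adjacency(int n, int[][] edges) {
        int[][] m = new int[n][n];
        for (int[] e : edges) {
            m[e[0]][e[1]] = 1;
            m[e[1]][e[0]] = 1; // undirected: the matrix is symmetric
        }
        return m;
    }

    // Incidence matrix: one row per edge, one column per vertex.
    public static int[][] incidence(int n, int[][] edges) {
        int[][] m = new int[edges.length][n];
        for (int i = 0; i < edges.length; i++) {
            m[i][edges[i][0]] = 1;
            m[i][edges[i][1]] = 1;
        }
        return m;
    }

    public static void main(String[] args) {
        int[][] star = {{0, 1}, {0, 2}, {0, 3}};
        int[][] adj = adjacency(4, star);
        // the center's row sums to its degree
        System.out.println(adj[0][1] + adj[0][2] + adj[0][3]); // prints 3
    }
}
```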

    Although the lesson only scratched the surface on graph theory, I reveled in the fact that many of the terms I learned were actually terms that I encountered at work, terms that I had no idea were rooted in mathematics.

    For example, the term forest (in graph theory) is defined as a collection of trees; I encountered this term while working as a system administrator, a time when I managed Active Directory (a centralized authentication system) and all the security rules and policies bound to a forest.

    In addition to the example above, I find comfort in the fact that some data structures (e.g. binary search tree) and algorithms (e.g. breadth first search) that I’ve studied in the past build on top of graph theory.

    Star-like graphs that are nonisomorphic

    In any case, I really enjoyed learning about graph theory, a lesson I’ll continue to build on as I pursue my lifelong study of mathematics and computer science.

  • Data structures and algorithms in Java – Inspectional reading

    My plan to complete Data Structures and Algorithms in Java by December 31st, 2016, requires changing my reading strategy. Reading technical books cover to cover, like a novel, is rewarding but inefficient and impractical. How about a different approach?

    I’m applying active reading principles from How to read a book[1] and A mind for numbers: questioning the author’s intention and identifying the questions the author is attempting to answer.

    I skimmed the 15 chapters (although I had already read the chapters on maps) and below are my notes.

    The skeleton

    Before skimming each chapter, I paused and flipped to the page that maps each chapter/section to the ACM guidelines[2]. I bookmarked this page because it helps frame the entire study session — constant questioning of what, and why, I am reading the material.

    Mapping between book and ACM guidelines

     

    Chapter 1

    Introduction to Java and basic programming principles. Conditionals, loops — basic stuff.

    Chapter 2

    Object-oriented design. Introduction to design patterns such as the adapter pattern. The three criteria for good software: robustness (handling unexpected inputs), adaptability (adapting to changing requirements), and reusability (reusing code elsewhere).

    Chapter 3

    Fundamental data structures, including arrays and linked lists — singly, doubly, and circularly linked. Covers the advantages of linked lists over arrays, and vice versa. Compares shallow and deep copies: a shallow copy duplicates the pointers, while a deep copy instantiates new copies of the referenced objects.
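    The shallow/deep distinction is easy to see in a few lines of Java (a sketch of my own, using nested arrays as stand-in objects):

```java
// Shallow vs. deep copy, demonstrated with an array of int[] "objects".
// A shallow copy shares the referenced objects; a deep copy duplicates them.
public class CopyDemo {
    public static int[][] shallowCopy(int[][] src) {
        return src.clone(); // new outer array, but the same inner references
    }

    public static int[][] deepCopy(int[][] src) {
        int[][] dst = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i].clone(); // duplicate each inner array too
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] original = {{1}, {2}};
        int[][] shallow = shallowCopy(original);
        int[][] deep = deepCopy(original);
        original[0][0] = 99;
        System.out.println(shallow[0][0]); // prints 99: the shallow copy sees the mutation
        System.out.println(deep[0][0]);    // prints 1: the deep copy is unaffected
    }
}
```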

    Chapter 4

    Establishing nomenclature for comparing performance: big-O. What big-O notation is, why it exists, and its history. The three levels of performance: best, worst, and average. The seven mathematical functions, described in ascending order, from best to worst:

    1. Constant
    2. Logarithmic
    3. Linear
    4. Linearithmic (n log n)
    5. Quadratic
    6. Cubic
    7. Exponential

    Chapter 5

    Recursion. Implementing a recursive search algorithm: binary search. When is recursion a bad idea? What tail recursion is and how to eliminate it.
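    Tail-call elimination is easy to see with Euclid’s gcd (my own example, which I’m assuming tracks the chapter’s discussion): because the recursive call is the function’s last action, it can be rewritten as a loop with no stack growth.

```java
// Tail recursion and its elimination, sketched with Euclid's gcd.
public class TailRecursion {
    // Tail-recursive: nothing happens after the recursive call returns.
    public static int gcdRecursive(int a, int b) {
        if (b == 0) return a;
        return gcdRecursive(b, a % b);
    }

    // The same algorithm with the tail call eliminated by hand:
    // the recursion becomes a loop that reassigns the parameters.
    public static int gcdIterative(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcdRecursive(48, 18)); // prints 6
        System.out.println(gcdIterative(48, 18)); // prints 6
    }
}
```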

    Chapter 6

    Additional data structures. Introducing additional abstract data types (ADTs): stacks and queues. Two primary methods of implementing a stack: array and linked list. Leveraging the modulus technique to maintain a circular queue. The deque, pronounced “deck”, implements properties of both stack and queue: add to head or tail, and remove from head or tail.
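    The modulus technique amounts to letting the front and back indices wrap around the end of the backing array. A minimal sketch (a fixed-capacity version; the book’s implementation details may differ, and a production version would grow the array when full):

```java
// Circular array-backed queue: enqueue and dequeue wrap around the array
// end using the modulus operator, so no elements ever need shifting.
public class CircularQueue {
    private final int[] data;
    private int head = 0; // index of the front element
    private int size = 0;

    public CircularQueue(int capacity) {
        data = new int[capacity];
    }

    public void enqueue(int x) {
        if (size == data.length) throw new IllegalStateException("queue is full");
        data[(head + size) % data.length] = x; // wrap past the end of the array
        size++;
    }

    public int dequeue() {
        if (size == 0) throw new IllegalStateException("queue is empty");
        int x = data[head];
        head = (head + 1) % data.length; // advance the front, wrapping if needed
        size--;
        return x;
    }

    public int size() {
        return size;
    }
}
```

    With capacity 3, enqueuing 1, 2, 3, dequeuing once, then enqueuing 4 reuses the slot that 1 vacated.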

    Chapter 7

    High-level concepts of object-oriented programming. Instead of relying on the client to size the array up front, a more client-friendly solution is offered: dynamic arrays. The iterator interface is introduced, as well as positional lists.

    Chapter 8

    Generic trees and ordered trees, including binary and binary search trees, and techniques to traverse them: preorder and postorder. A peek at breadth-first traversal, a concept I thought was reserved for graphs. The chapter ends with use cases showing the real problems trees can model and solve.

    Chapter 9

    Another ADT: the priority queue, which can be implemented with an unsorted positional list or a sorted list. The unsorted list offers O(1) insertion but O(n) retrieval of the minimum. The sorted list, instead, offers O(1) for the minimum but O(n) for insertion. Is there a data structure that offers the best of both worlds, something in between? Yes: the heap. The heap offers O(log n) for both inserting and retrieving the minimum value. Additional features, such as removing an arbitrary (non-minimum) entry from a priority queue, require adapting the ADT.
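    Here is a minimal array-backed min-heap sketch showing where the O(log n) comes from: both operations walk a single root-to-leaf path of the implicit binary tree.

```java
import java.util.ArrayList;
import java.util.List;

// Array-backed binary min-heap: O(log n) insert and O(log n) remove-min.
// Parent of index i lives at (i - 1) / 2; children at 2i + 1 and 2i + 2.
public class MinHeap {
    private final List<Integer> h = new ArrayList<>();

    public void insert(int x) {
        h.add(x);
        int i = h.size() - 1;
        while (i > 0 && h.get((i - 1) / 2) > h.get(i)) { // bubble up past larger parents
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    public int removeMin() {
        int min = h.get(0);
        int last = h.remove(h.size() - 1);
        if (!h.isEmpty()) {
            h.set(0, last);
            int i = 0;
            while (true) { // sift down toward the smaller child
                int l = 2 * i + 1, r = 2 * i + 2, smallest = i;
                if (l < h.size() && h.get(l) < h.get(smallest)) smallest = l;
                if (r < h.size() && h.get(r) < h.get(smallest)) smallest = r;
                if (smallest == i) break;
                swap(i, smallest);
                i = smallest;
            }
        }
        return min;
    }

    private void swap(int i, int j) {
        int t = h.get(i);
        h.set(i, h.get(j));
        h.set(j, t);
    }
}
```

    Java’s own java.util.PriorityQueue is backed by the same structure.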

    Chapter 10

    The map ADT is introduced. A primitive implementation of a map, via an array, is introduced, followed by the hash table. The chapter ends with sorted maps, skip lists, and sets.

    Chapter 11

    Binary search trees, a special type of tree. To combat the worst-case scenario (inserting sorted items produces, effectively, a linked list), another ADT is introduced: AVL trees. Two other balanced trees are introduced: splay and (2,4) trees. The chapter ends with red-black trees.

    Chapter 12

    Up until Chapter 12, only two elementary sorting algorithms have been introduced: insertion sort and selection sort. Divide and conquer, an algorithm design pattern, is introduced with two concrete implementations: merge sort and quick sort — both with a runtime complexity of O(n log n). Radix and bucket sort improve on that runtime complexity, reaching O(n) for suitable inputs. The concept of stable sorting algorithms is introduced, and we face another type of problem: selection (e.g. picking the k-th smallest item).
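    Merge sort is the cleaner of the two to sketch, since the divide/conquer split is explicit (a minimal version of my own, not the book’s listing):

```java
import java.util.Arrays;

// Divide-and-conquer sorting: merge sort, O(n log n).
public class MergeSort {
    public static int[] sort(int[] a) {
        if (a.length <= 1) return a; // base case: already sorted
        int mid = a.length / 2;
        int[] left = sort(Arrays.copyOfRange(a, 0, mid));        // divide
        int[] right = sort(Arrays.copyOfRange(a, mid, a.length)); // divide
        return merge(left, right);                                // conquer
    }

    private static int[] merge(int[] l, int[] r) {
        int[] out = new int[l.length + r.length];
        int i = 0, j = 0, k = 0;
        while (i < l.length && j < r.length) {
            out[k++] = (l[i] <= r[j]) ? l[i++] : r[j++]; // <= keeps the sort stable
        }
        while (i < l.length) out[k++] = l[i++];
        while (j < r.length) out[k++] = r[j++];
        return out;
    }
}
```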

    Chapter 13

    Pretty neat topic on tackling “string matching.” Discusses a primitive, brute-force method for string matching — I think we can all guess how that works — followed by advanced techniques: the Boyer-Moore and Knuth-Morris-Pratt algorithms. Covers compression algorithms, including Huffman coding. Covers dynamic programming, a method for matching string sequences that reduces the runtime complexity from exponential by implementing “a few loops”.
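    The brute-force method the chapter starts from is exactly the guessable one: slide the pattern across the text and compare character by character, O(nm) in the worst case (a sketch of my own, not the book’s listing):

```java
// Brute-force string matching: try every alignment of the pattern against
// the text. Worst case O(n * m) for text length n and pattern length m.
public class BruteForceMatch {
    // Returns the index of the first occurrence of pattern in text, or -1.
    public static int indexOf(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) j++;
            if (j == m) return i; // full match starting at alignment i
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("abracadabra", "cad")); // prints 4
    }
}
```

    Boyer-Moore and Knuth-Morris-Pratt both beat this by using information from failed comparisons to skip alignments.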

    Chapter 14

    Introduces another ADT – graphs – and the respective terms: edges, vertices, directed, undirected. Presents two different strategies for keeping track of neighbors: the adjacency matrix and the adjacency list. Discusses two methods for searching for nodes: depth-first search and breadth-first search. Covers several methods for constructing a minimum spanning tree. Like OSPF, introduces Dijkstra’s algorithm for determining shortest paths.

     

    Chapter 15

    Dives into characteristics of the JVM: garbage collection. Introduces the program counter (PC) — not personal computer — used to keep track of function/procedure calls. Compares the types of memory and their hierarchy in terms of performance. How data is stored on external disks, using structures such as B-trees.

    Footnotes

    [1] – How to read a book – http://pne.people.si.umich.edu/PDF/howtoread.pdf and a similar article: https://www.farnamstreetblog.com/how-to-read-a-book/
    [2] – ACM – https://www.acm.org/education/CS2013-final-report.pdf