Study & Book Report : Data Structures and Algorithms in Java ~ Goodrich/Tamassia/Goldwasser

Best book on data structures and algorithms I have read so far.
Checkout(Affiliate Links):

Just wanted give a quick update on my regular study. I have been reading DS/Algo book by Goodrich/Tamassia for quite sometime now. This is turning out to be an excellent read. Finished chapter 11. Still 4 more chapters to go.

This book is not like your typical algorithm book e.g. "Introduction to Algorithms" or "The Algorithm Design Manual". It's more like a hands on approach for data structures and algorithms. Actually this book is mainly focused on data structures rather than algorithm. Of course i like that. Because there are ton of book out there that are more focused on algorithm rather than data structure. But i needed something which is more focused on data structures than algorithm.

The best thing about this book i found is that, it's not just talking about procedure of an algorithm but it goes ahead and shows you how you can implement that on your own. Of course i tried to use many online videos and resources as well, when it became harder to understand from the book just by reading. For example, when trying to understand Skip-Lists, BST, or Red-Black Tree i had to use multiple online resources to clearly get the idea of those algorithms. Although this made my study longer than usual but in the end i believe it is needed for my future understanding. I will try to list the external resources that i have used along with this book.

If you are somewhat familiar with programming and OOP in general you can gloss over the first couple of chapter without any problem. This is specially true for first 2 chapters.

This is also somewhat true for chapter 4 & 5. There are many good resources out there for algorithm analysis and recursion. I think it would be better if you can find some other resources along with these. I would say the meat of the book starts from chapters 3 and 6.

Chapter 3 : "Fundamental Data Structures"

As the name suggests this chapter touches on all the fundamental data structure based upon which, all the other super data structures derives. Array and LinkedList are crucial for ever other kind of data structure. Want stack? You can do that with Array or LinkedList. Want Queue? ArrayQueue or LinkedQueue? Tree? BST? Heap? PriorityQueue? Graph? You name it. Everything can be made from this two fundamental data structures. So, understanding them is crucial for any kind DS you will work with in future.

Chapter 6 : "Stacks, Queues, and Deques"

I was somewhat familiar with Stacks and had some understanding about Queue but not with Deque(Pronounced: "Deck"). Don't pronounce "dequeue" because "enqueue" and "dequeue" is the process of inserting and deleting element from regular "Queue". Yes, before reading this book i also thought Deque(deck) was dequeue :) . Stacks and Queues are used frequently when working with different searching algorithms like bfs/dfs . But i have found stacks are more prevalent than queues. Deque has its own uses but not as common as stacks/queues.

Chapter 7 : List and Iterator ADTs

This chapter actually does not introduces anything new but it builds a strong basis for the chapters ahead. It actually augment chapter 3 and add some more constraint and facilities to our existing Array and LinkedList ADT. For example how do you make dynamic array from regular fixed length array. It also introduces the amortization concept for dynamic array. But i found amortized analysis lectures https://www.coursera.org/lecture/data-structures/dynamic-arrays-EwbnV by Neil Rhodes are excellent. Mainly this chapter builds a framework for upcoming chapters.

Chapter 8 : Trees

I have already been exposed to trees many times already, so i was able to finish this chapter quickly. https://www.youtube.com/playlist?list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P this playlist helped me a lot. If you can finish this playlist start to finish it will help you a lot going forward. Excellent resource. I keep coming back to it, from time to time.

Chapter 9 : Priority Queues

An important ADT to keep in your toolbox. Previously i always confused myself with queue and priority queue and heap. NO, they are not the same. They have different use cases for different purposes.

"Queue" is FIFO : First in first out. And "Priority Queue" is MIFO (Most Important First Out). No, MIFO is not a text book term. It's just for my own convenience. It means you extract item from your priority queue based on their priority. Priority Queue can be implemented using Array or LinkedList(with O(n) to find or remove min/max), it's up to you. But they are mostly implemented with another popular tree based data structure called Heap. It's not that Heap and Priority Queue are the same. But Heap has a natural tendency to keep everything balanced as well as keep max/min element to the top according to the requirement. Which helps with the efficiency of the algorithm O(logn). And it so happens that priority queue need something like this to maintain its element's priority. That is why people usually gets confused whether both are same or not. Since heap is natural, to maintain its element's priority and with some basic tweaking we can use heap as our priority queue. Now it's really up to you how you want to think about these two :) . But they are used in many famous graph algorithms. So, don't take them lightly...

Another important use case of Heap is for heapsort, which is yet another O(nlogn) sorting algorithm. But the interesting thing about this sorting is that you are not explicitly trying to sort the element. It is somewhat bonus for this data structure. Insert the element in this data structure as usual. And after that, keep extracting the min element to get the all elements. And voila all the items are now sorted.

Chapter 10 : Maps, Hash Tables, and Skip Lists

For dictionary or key value based data structure we have map. This chapter introduces us to map ADT. Also you will start to get acquainted with Hash Table, Sorted Map, Skip Lists, Set ADTs as well. Every one of these has their own special uses. Among these i think, HashMap and HashSet are the most popular one for their near constant lookup performance.

I say near constant because with hash table, you will not get perfect constant time lookup due to hash collision. Of course you can try to avoid hash collision using Chaining or Open Addressing but those will increase the lookup time by some margin. So, you have to be aware of that.

Although there are other Map implementations like SortedMap or TreeMap but they have different uses than HashMap. One is used as look up table and the other is used for organizing our data based on key/value which we can query to get our desired result.

Another interesting data structure you will encounter is Skip List. The most clever data structure i have seen so far, nothing fancy. Really intuitive yet clever. Once you know about this one, you will feel like you could've come up with this one easily. Yet you didn't :). Although we typically don't get anything better than O(n) with our regular LinkedList but with some clever thinking and tweaking if we can turn our LinkedList into Skip List we can get pretty remarkable performance which is on the order of O(logn). So, with Skip List we can get expected O(logn) search/insert/removal time with high probability .

Chapter 11 : Search Trees

Trees and Graphs are so common and so important in computer science, i feel like you can spend your entire life just studying them :) . This chapter covers one of the most important topic of tree data structure and that is Binary Search Tree (BST). Although BST starts to loose its glory when we encounter any unbalance situation but if you can add balance to your BST it truly becomes a great data structure.

So, BST is great only if they are balanced. To keep them balanced you will see couple of algorithms like AVL Tree, Splay Tree, 2-4 Tree, Red-Black Tree. Although i feel every one of them is important to understand but if you ask me i would give priority to Splay Tree and Red-Black Tree. As of my understanding these two are used heavily in practice. Btw don't compare Splay Tree with Red-Black Tree. Splay Tree is different beast than Red-Black Tree. If you want to compare Red-Black Tree with something, then compare it with AVL or 2-4 Tree. Although all of them guarantee O(logn) performance where as Red-Black Tree requires less rotation, that is why in practice people use Red-Black Tree more often than others. To understand these concept i had to use multiple resources. Here are couple of them:

https://archive.org/details/ucberkeley_webcast_G5QIXywcJlY

and many more.

For Splay Tree, this is another clever data structure which keep itself balanced by some clever thinking. As you will find out even though other trees keep themselves strictly or roughly balanced but they do it with some extra cost. But not Splay Tree. It's just do :) . Not only that, since we do splaying with every operation in some situation you might discover that you are getting your data in constant time.

Chapter 12 : Sorting and Selection

Reading now...

Am not finished yet... I will be updating this post as i finish this book.