Linear data structures store their elements in a sequential order: each element has exactly one predecessor and one successor, except the first and the last.
In the first part of this document you will learn:
We use Big O notation to describe the performance of an algorithm, expressed as its time complexity and space complexity, and how well it scales as the input grows.
The code below is O(1) because it does a constant amount of work regardless of the input size.
public void log(int[] numbers)
{
    // O(1)
    Console.WriteLine(numbers[0]);
}
public void logLoop(int[] numbers, string[] texts)
{
    // O(n + m): two sequential loops over inputs of size n and m
    // O(n)
    foreach (var item in numbers)
        Console.WriteLine(item);
    // O(m)
    foreach (var item in texts)
        Console.WriteLine(item);
}
public void LogNestedLoop(int[] numbers)
{
    // O(n^3): three nested loops over the same input
    foreach (var item in numbers)
        foreach (var item1 in numbers)
            foreach (var item2 in numbers)
                Console.WriteLine(item2);
}
Row | Name | Big O Notation |
---|---|---|
1 | Constant | O(1) |
2 | Logarithmic | O(log n) |
3 | Linear | O(n) |
4 | Quadratic | O(n^2) |
5 | Exponential | O(2^n) |
As shown in the above image, the constant curve stays flat no matter how the input grows, while the logarithmic curve flattens out as the input gets larger; logarithmic algorithms are therefore more efficient and scalable than linear ones. Exponential growth is the opposite of logarithmic: it is inefficient and very costly to run.
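The table above lists O(log n); binary search over a sorted array is the classic example. Below is a minimal sketch (the class name and sample values are my own, not from the original text):

```csharp
using System;

public class Search
{
    // O(log n): every comparison halves the remaining search range.
    public static int BinarySearch(int[] sorted, int target)
    {
        int low = 0, high = sorted.Length - 1;
        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) low = mid + 1;
            else high = mid - 1;
        }
        return -1; // not found
    }
}
```

Doubling the length of the array adds only one extra comparison, which is exactly the logarithmic curve described above.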
Space complexity refers to the total amount of memory space used by an algorithm/program, including the space of input values for execution.
Time complexity measures the time an algorithm takes to run as the input grows. In other words, it expresses the worst-case running time of an algorithm: Big O notation describes the upper bound of an algorithm's runtime.
Arrays are a simple data structure used to store data sequentially. If you want to access data by index, arrays are the best choice.
Features of Array Data Structure:
- A simple data structure
- The items are stored contiguously
- The length of the array is fixed
- You cannot add or remove items without copying to a new array
The example below implements a resizable Array class with Add, Remove, IndexOf and Print methods:
internal class Array
{
    private int[] array;
    private int count;

    public Array(int length)
    {
        array = new int[length];
    }

    public void Add(int item)
    {
        // If the array is full, double its capacity and copy the items over.
        if (count == array.Length)
        {
            int[] copyArray = array;
            array = new int[count * 2];
            for (int index = 0; index < count; index++)
                array[index] = copyArray[index];
        }
        array[count++] = item;
    }

    public void Print()
    {
        for (int index = 0; index < count; ++index)
            Console.WriteLine(array[index]);
    }

    public void Remove(int index)
    {
        if (index >= count || index < 0)
            throw new IndexOutOfRangeException();
        // Shift every item after the removed one a position to the left.
        for (int i = index; i < count - 1; ++i)
            array[i] = array[i + 1];
        count--;
    }

    public int IndexOf(int value)
    {
        for (int i = 0; i < count; i++)
            if (array[i] == value)
                return i;
        return -1;
    }
}
Row | DataType | Default Value |
---|---|---|
1 | boolean | false |
2 | int | 0 |
3 | double | 0.0 |
4 | String | null |
5 | User-Defined Type | null |
A Linked List stores objects in sequence but, unlike an array, it can grow or shrink automatically, as shown in the image below.
A linked list keeps references to the first and last nodes of the list, and each node links to the next one, so new items can be attached to either end.
A linked list implementation from scratch:
internal class LinkedList
{
    private class Node
    {
        public int value;
        public Node next;

        public Node(int value)
        {
            this.value = value;
        }
    }

    private Node first;
    private Node last;
    private int size;

    public void AddFirst(int item)
    {
        var mNode = new Node(item);
        if (first == null)
            last = first = mNode;
        else
        {
            mNode.next = first;
            first = mNode;
        }
        size++;
    }

    public void AddLast(int item)
    {
        var mNode = new Node(item);
        if (first == null)
            last = first = mNode;
        else
        {
            last.next = mNode;
            last = mNode;
        }
        size++;
    }

    public int IndexOf(int item)
    {
        int index = 0;
        var current = first;
        while (current != null)
        {
            if (current.value == item) return index;
            current = current.next;
            index++;
        }
        return -1;
    }

    public bool Contains(int value)
    {
        var current = first;
        while (current != null)
        {
            if (current.value == value) return true;
            current = current.next;
        }
        return false;
    }

    public void RemoveFirst()
    {
        if (first == null)
            throw new InvalidOperationException();
        if (first == last)
        {
            first = last = null;
            size--;
            return;
        }
        var second = first.next;
        first.next = null;
        first = second;
        size--;
    }

    public void RemoveLast()
    {
        if (first == null)
            throw new InvalidOperationException();
        if (first == last)
        {
            first = last = null;
            size--;
            return;
        }
        // Walk to the node just before the last one.
        var current = first;
        while (current.next != last)
            current = current.next;
        current.next = null;
        last = current;
        size--;
    }

    public int Size()
    {
        return size;
    }

    public int[] toArray()
    {
        int[] result = new int[size];
        var current = first;
        int index = 0;
        while (current != null)
        {
            result[index++] = current.value;
            current = current.next;
        }
        return result;
    }
}
As the code above shows, the Node class stores the value of an item plus a reference to the next node, which creates the connection between items.
There are two different types of access: sequential access and random access.
- Sequential access means reading the elements one by one, as in a linked list.
- Random access means jumping directly to, say, the 10th element, as in an array.
We want to compare the two structures in terms of space and time complexity.
- Static arrays have a fixed size.
- Dynamic arrays grow by 50-100% when they run out of space.
- Linked lists don't waste memory on unused capacity.
- Use arrays if you know the number of items to store.
Time Complexity
Row | Function name | Array | Linked List |
---|---|---|---|
1 | (LookUp) By Index | O(1) | O(n) |
2 | (LookUp) By Value | O(n) | O(n) |
3 | (Insert) Beginning/End | O(n) | O(1) |
4 | (Insert) Middle | O(n) | O(n) |
5 | (Delete) Beginning | O(n) | O(1) |
6 | (Delete) Middle | O(n) | O(n) |
7 | (Delete) End | O(n) | O(n) |
There are two types of linked lists: singly and doubly. What is the difference between them? In a singly linked list each node links only to the next node, so deleting the item at the end requires looping through the whole list. In a doubly linked list each node also stores a link to the previous node, which makes removing from the end O(1), as the image below shows.
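A minimal sketch of a doubly linked node and the resulting O(1) RemoveLast; the class and field names are my own, modeled on the singly linked implementation above:

```csharp
using System;

internal class DoublyLinkedList
{
    private class Node
    {
        public int value;
        public Node next;
        public Node previous; // the extra backward link
        public Node(int value) { this.value = value; }
    }

    private Node first;
    private Node last;

    public void AddLast(int item)
    {
        var node = new Node(item);
        if (first == null)
            first = last = node;
        else
        {
            node.previous = last;
            last.next = node;
            last = node;
        }
    }

    // O(1): no traversal needed, we just follow the previous link.
    public int RemoveLast()
    {
        if (last == null)
            throw new InvalidOperationException();
        int value = last.value;
        last = last.previous;
        if (last == null)
            first = null;
        else
            last.next = null;
        return value;
    }
}
```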
Stacks are powerful data structures that help us solve many complex programming problems. A stack is a linear data structure that follows the LIFO (Last-In, First-Out) principle. It has a single pointer, the top, which points to the topmost element of the stack. Elements are added to the top of the stack, and they can be removed only from the top.
- Stacks are LIFO (Last-In, First-Out): new elements are added to the top of the stack, and the first element to be retrieved is the last one added.
- push(item): add an item to the top of the stack.
- pop(): remove the item from the top.
- peek(): return the item on the top without removing it.
- isEmpty(): determine whether the stack is empty.
All stack operations run in O(1), constant time.
1- We can use a stack to reverse a string, as in the code below:
internal class StringReverser
{
    public string Reverse(string input)
    {
        var result = new StringBuilder();
        var chars = new Stack<char>();
        foreach (var item in input)
            chars.Push(item);
        // Popping returns the characters in reverse order.
        while (chars.Count > 0)
            result.Append(chars.Pop());
        return result.ToString();
    }
}
2- A stack implementation from scratch:
internal class Stack
{
    private int[] items = new int[5];
    private int count;

    public void push(int item)
    {
        if (count == items.Length)
            throw new StackOverflowException();
        items[count++] = item;
    }

    public int pop()
    {
        if (count == 0)
            throw new InvalidOperationException();
        return items[--count];
    }

    public int peek()
    {
        if (count == 0)
            throw new InvalidOperationException();
        // Unlike pop, peek does not change count.
        return items[count - 1];
    }
}
A queue can be defined as an ordered list in which insertions are performed at one end, called the back (rear), and deletions at the other end, called the front, as shown below:
- Queues are FIFO (First-In, First-Out): the first element retrieved from the queue is the first one added.
Queues are similar to real-world queues, in which people join the line from the back and leave from the front.
- enqueue(): add an item to the back of the queue.
- dequeue(): remove the item at the front of the queue.
- peek(): get the item at the front without removing it.
- isEmpty(): determine whether the queue is empty.
- isFull(): determine whether the queue is full.
All queue operations run in O(1), constant time.
internal class ArrayQueue
{
    int[] array;
    int frontIndex;
    int backIndex;
    int count;

    public ArrayQueue(int size)
    {
        array = new int[size];
    }

    public void enqueue(int item)
    {
        if (count == array.Length)
            throw new InvalidOperationException();
        // Insert at the back and wrap around the end of the array.
        array[backIndex] = item;
        backIndex = (backIndex + 1) % array.Length;
        count++;
    }

    public int dequeue()
    {
        if (count == 0)
            throw new InvalidOperationException();
        // Remove from the front, wrapping around as well.
        int current = array[frontIndex];
        array[frontIndex] = 0;
        frontIndex = (frontIndex + 1) % array.Length;
        count--;
        return current;
    }

    public int peek()
    {
        if (count == 0)
            throw new InvalidOperationException();
        return array[frontIndex];
    }

    public bool isEmpty()
    {
        return count == 0;
    }

    public bool isFull()
    {
        return count == array.Length;
    }
}
The code above uses a circular array to solve the problem of the fixed array size.
Implementing a queue with stacks requires two stacks: stack1 for enqueue and stack2 for dequeue.
internal class StackQueue
{
    Stack<int> stack1;
    Stack<int> stack2;

    public StackQueue()
    {
        stack1 = new Stack<int>();
        stack2 = new Stack<int>();
    }

    // O(1)
    public void enqueue(int item)
    {
        stack1.Push(item);
    }

    // O(n) in the worst case, amortized O(1)
    public int dequeue()
    {
        if (stack1.Count == 0 && stack2.Count == 0)
            throw new InvalidOperationException();
        // Move items only when stack2 is empty; this reverses their order.
        if (stack2.Count == 0)
            while (stack1.Count > 0)
                stack2.Push(stack1.Pop());
        return stack2.Pop();
    }

    public bool isEmpty()
    {
        return stack1.Count == 0 && stack2.Count == 0;
    }
}
What is a PriorityQueue?
A PriorityQueue is similar to a Queue, but items are retrieved by priority rather than by insertion order. In C#'s PriorityQueue&lt;TElement, TPriority&gt;, the second argument of Enqueue is the priority, and Dequeue returns the element with the lowest priority value first:
// The first type parameter is the element, the second is the priority
PriorityQueue<int,int> priorityQueue = new PriorityQueue<int,int>();
priorityQueue.Enqueue(2, 1);
priorityQueue.Enqueue(3, 3);
priorityQueue.Enqueue(4, 2);
priorityQueue.Dequeue(); // returns 2, the element with priority 1
Hash tables, called Dictionary in C#, give us super-fast lookups, and we can use them for:
- Spell checkers
- Dictionaries
- Compilers
- Code editors
What other programming languages call their hash table implementation:
Java | JavaScript | Python | C# |
---|---|---|---|
HashMap | Object | Dictionary | Dictionary |
A hash function is a function where you put in a string and you get back a number, so a hash function maps strings to numbers.
- Insert(): add a key-value pair.
- Lookup(): find the value for a given key.
- Delete(): remove a key and its value.
A hash table stores data as key-value pairs, and all of the above operations run in O(1) on average.
A Dictionary in C# code, with time complexities:
Dictionary<int, string> part1 = new Dictionary<int, string>();
part1.Add(0, "Mohsen"); // O(1)
part1.Add(1, "Jhon");
var containsValue = part1.ContainsValue("Mohsen"); // O(n)
var containsKey = part1.ContainsKey(10); // O(1)
var dictionaryItem = part1[0]; // O(1); Single(x => x.Key == 0) would enumerate in O(n)
1- Find the first non-repeating character in a string:
internal class CharFinder
{
    public char FindFirstNonRepeatingChar(string input)
    {
        Dictionary<char, int> map = new Dictionary<char, int>();
        foreach (var item in input)
        {
            var count = map.ContainsKey(item) ? map[item] : 0;
            map[item] = count + 1;
        }
        foreach (var item in map)
            if (item.Value == 1) return item.Key;
        return ' ';
    }
}
First we count every character and store the counts in the dictionary; then we look for the first entry whose count is 1 and return its key.
2- Create a Hash Table with Array and Linked List:
internal class HashTable
{
    internal class Entry
    {
        public int Key;
        public string Value;

        public Entry(int key, string value)
        {
            Key = key;
            Value = value;
        }
    }

    private LinkedList<Entry>[] _array;

    public HashTable(int size)
    {
        _array = new LinkedList<Entry>[size];
    }

    public void put(int key, string value)
    {
        var hashkey = hash(key);
        if (_array[hashkey] == null)
        {
            var linked = new LinkedList<Entry>();
            linked.AddLast(new Entry(key, value));
            _array[hashkey] = linked;
        }
        else
        {
            // If the key already exists, update its value.
            foreach (var item in _array[hashkey])
            {
                if (item.Key == key)
                {
                    item.Value = value;
                    return;
                }
            }
            _array[hashkey].AddLast(new Entry(key, value));
        }
    }

    public Entry get(int key)
    {
        var hashkey = hash(key);
        var item = _array[hashkey];
        if (item != null)
            return item.FirstOrDefault(x => x.Key == key);
        return null;
    }

    public void remove(int key)
    {
        var hashkey = hash(key);
        if (_array[hashkey] == null)
            throw new InvalidOperationException();
        var item = _array[hashkey];
        if (item.Count == 1)
            _array[hashkey] = null;
        else
        {
            var innerItem = item.Single(x => x.Key == key);
            item.Remove(innerItem);
        }
    }

    private int hash(int key)
    {
        return key % _array.Length;
    }
}
To get constant O(1) lookups we store entries in an array. So why does the code above use a linked list as well? Imagine an array with a fixed size of 5, so the index must be between 0 and 4. What happens if the input key is 11?
In that scenario we must hash the key: if the key is an integer, we take the remainder of the key divided by the array length using the % (modulo) operator, which maps any key into the valid range. For example:
var index = key % array.Length;
// 11 % 5 == 1
// 1 % 5 == 1
Another problem can occur when the hash function generates the same index for two different keys. We cannot store two items at the same index; this is called a collision, as the image below shows.
If we use a good hash function the collisions will not happen a lot.
We have two ways to fix this issue
1- Chaining: Store multiple items in a Linked List as shown in the above example.
As you can see we store A and C items in an index one but in a linked list.
2- Open Addressing: In this method, we do not insert duplicated data in the Linked List, we store them in an Array itself.
Types of Open Addressing:
- Linear Probing: we linearly probe for the next free slot. If slot [hash(x) % size] is full, we try [(hash(x) + 1) % size]; if that is full too, we try [(hash(x) + 2) % size], and so on until an available slot is found. Linear probing has the best cache performance, but its downside is primary and secondary clustering.
- Quadratic Probing: we probe in steps of i². If slot [hash(x) % size] is full, we try [(hash(x) + 1*1) % size]; if that is also full, we try [(hash(x) + 2*2) % size], and so on until an available slot is found. Secondary clustering might still arise, and there is no guarantee of finding a slot with this approach.
- Double Hashing: we use a second hash function hash2(x) and probe in steps of i*hash2(x). If slot [hash(x) % size] is full, we try [(hash(x) + 1*hash2(x)) % size]; if that is full too, we try [(hash(x) + 2*hash2(x)) % size], and so on until an available slot is found. There is no primary or secondary clustering, but it requires more computation.
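As an illustration of linear probing, here is a minimal open-addressing sketch; the class and method names are my own, keys are assumed non-negative, and the table is assumed never to fill completely:

```csharp
using System;

internal class LinearProbingTable
{
    private readonly int?[] keys;
    private readonly string[] values;

    public LinearProbingTable(int size)
    {
        keys = new int?[size];
        values = new string[size];
    }

    private int hash(int key) => key % keys.Length;

    public void put(int key, string value)
    {
        int index = hash(key);
        // Probe forward until we find the key or an empty slot.
        while (keys[index] != null && keys[index] != key)
            index = (index + 1) % keys.Length;
        keys[index] = key;
        values[index] = value;
    }

    public string get(int key)
    {
        int index = hash(key);
        // Stop at the first empty slot: the key cannot be further along.
        while (keys[index] != null)
        {
            if (keys[index] == key) return values[index];
            index = (index + 1) % keys.Length;
        }
        return null;
    }
}
```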
You will almost never have to implement a hash table yourself. The programming language you use should provide an implementation for you.
Nonlinear data structures, as opposed to linear ones, do not arrange their elements in a sequence. Instead, elements are arranged hierarchically, where each element may be linked to one or more other elements. Nonlinear structures divide into tree-based and graph-based structures.
In this part, you will learn:
The Tree is a data structure that stores elements in a hierarchy; the elements are called nodes and the connecting lines between them are called edges.
A Binary Tree, which is the most basic form of a tree structure, is a data structure where each node has at most two children. Those two children are referred to as the left and right children.
Usage:
- Representing hierarchical data
- Databases
- Autocompletion
- Compilers
- Compression (MP3, ...)
Different types of binary tree:
- Full Binary Tree: every node of the tree has either two children or none.
- Complete Binary Tree: every level of the tree is completely filled, except possibly the last one, and the nodes of the last level are as far left as possible.
- Perfect Binary Tree: every internal node has exactly two children and all leaves are at the same depth.
- Balanced Binary Tree: the tree's height is O(log n), with n being the number of nodes. We will explain this later with AVL and Red-Black trees, which are the most commonly used balanced binary search trees.
A Binary Search Tree (BST) is a special type of binary tree that lets us look up data quickly: for any node, every value in its left subtree is smaller than the node's value and every value in its right subtree is greater, which gives lookups logarithmic time complexity.
What about the time complexity of operations in BST?
As you can see in the above image, all operations on a balanced BST run in logarithmic time.
An implementation of the insert and find methods of a BST:
internal class Tree
{
    internal class Node
    {
        public int value;
        public Node leftChild;
        public Node rightChild;

        public Node(int value)
        {
            this.value = value;
        }
    }

    private Node root;

    public void insert(int item)
    {
        // If the root is null, the first item becomes the root.
        if (root == null)
            root = new Node(item);
        else
        {
            var current = root;
            while (current != null)
            {
                if (current.value < item)
                {
                    if (current.rightChild != null)
                        current = current.rightChild;
                    else
                    {
                        current.rightChild = new Node(item);
                        break;
                    }
                }
                else
                {
                    if (current.leftChild != null)
                        current = current.leftChild;
                    else
                    {
                        current.leftChild = new Node(item);
                        break;
                    }
                }
            }
        }
    }

    public bool find(int value)
    {
        var current = root;
        while (current != null)
        {
            if (current.value < value)
                current = current.rightChild;
            else if (current.value > value)
                current = current.leftChild;
            else
                return true;
        }
        return false;
    }
}
There are several approaches to traversing a tree; they fall into two main categories:
- Breadth First Search (BFS)
Also called level-order traversal: you visit all the nodes at one level before visiting the nodes at the next level.
As you can see in the above image, first you visit 7, then 4, 9, 1, 6, 8 and 10.
To implement this search algorithm for trees and graphs we use a Queue data structure.
- Depth First Search (DFS)
In this method we have three different orders: pre-order, in-order, and post-order.
public void traversePreorder(Node root)
{
    if (root == null)
        return;
    Console.WriteLine(root.value);
    traversePreorder(root.leftChild);
    traversePreorder(root.rightChild);
}

public void traverseInorder(Node root)
{
    if (root == null)
        return;
    traverseInorder(root.leftChild);
    Console.WriteLine(root.value);
    traverseInorder(root.rightChild);
}

public void traversePostorder(Node root)
{
    if (root == null)
        return;
    traversePostorder(root.leftChild);
    traversePostorder(root.rightChild);
    Console.WriteLine(root.value);
}
To implement this search algorithm for trees and graphs we use a Stack data structure (recursion uses the call stack implicitly).
A node’s height is the number of edges to its most distant leaf node. On the other hand, a node’s depth is the number of edges back up to the root. So, the root always has a depth of 0 while leaf nodes always have a height of 0.
We have two ways to calculate the height:
- The recursive way
- The iterative way
There is a recursive formula for the height of a tree: calculate the heights of the left and right subtrees and take the larger of the two:
Height = 1 + max(height(L), height(R))
// O(n)
public int height(Node root)
{
    if (root == null) return -1;
    if (root.leftChild == null && root.rightChild == null)
        return 0;
    return 1 + Math.Max(height(root.leftChild), height(root.rightChild));
}
1- Find the minimum value in a tree:
We can implement this in two ways. The easiest is to traverse all the nodes of the tree recursively, which gives the method a time complexity of O(n):
// O(n)
public int min(Node root)
{
    if (root == null)
        throw new InvalidOperationException();
    var result = root.value;
    // Recurse only into children that exist.
    if (root.leftChild != null)
        result = Math.Min(result, min(root.leftChild));
    if (root.rightChild != null)
        result = Math.Min(result, min(root.rightChild));
    return result;
}
In a Binary Search Tree the minimum value is always in the leftmost node, so we only need to follow the left children; this makes the time complexity O(log n) on a balanced tree.
// O(log n)
public int min()
{
    if (root == null)
        throw new Exception();
    var current = root;
    var last = current;
    while (current != null)
    {
        last = current;
        current = current.leftChild;
    }
    return last.value;
}
If you want to find the greatest value in a BST, do the opposite and traverse down the right children of the tree.
2- Identify whether two trees are identical or not:
public bool equals(Node first, Node second)
{
    // Two empty trees are identical.
    if (first == null && second == null) return true;
    if (first != null && second != null)
    {
        return first.value == second.value
            && equals(first.leftChild, second.leftChild)
            && equals(first.rightChild, second.rightChild);
    }
    return false;
}
3- Validating a Binary Search Tree:
As we know, in a binary search tree every node's left subtree holds smaller values and its right subtree holds greater values, all the way down to the leaves. Rather than comparing every node against every other node, which costs a lot, we can pass each subtree a valid [min, max] range:
public bool isBinarySearchTree()
{
    return isBinarySearchTree(root, int.MinValue, int.MaxValue);
}

public bool isBinarySearchTree(Node root, int min, int max)
{
    if (root == null) return true;
    if (root.value < min || root.value > max) return false;
    return isBinarySearchTree(root.leftChild, min, root.value - 1)
        && isBinarySearchTree(root.rightChild, root.value + 1, max);
}
As mentioned before, every operation on a BST runs in O(log n), but only when the tree is balanced. AVL trees are a special type of BST that automatically re-balance themselves every time we add or remove nodes.
In a balanced tree, for every node:
|height(left) - height(right)| <= 1
We check the balance of a tree with the balance factor, the difference between the heights of the left and right subtrees:
BF = height(left) - height(right)
-1 <= BF <= +1
The balance factor of every node in an AVL tree should be -1, 0 or +1; if it is greater or smaller than these numbers, the tree must re-balance itself.
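A minimal sketch of computing the balance factor using the recursive height function shown earlier; the class name is my own, and the Node shape follows the Tree class above:

```csharp
using System;

internal class BalanceChecker
{
    internal class Node
    {
        public int value;
        public Node leftChild;
        public Node rightChild;
        public Node(int value) { this.value = value; }
    }

    // Height of an empty tree is -1, of a leaf is 0.
    public static int height(Node root)
    {
        if (root == null) return -1;
        return 1 + Math.Max(height(root.leftChild), height(root.rightChild));
    }

    public static int balanceFactor(Node root)
    {
        return height(root.leftChild) - height(root.rightChild);
    }

    // Every node's balance factor must be -1, 0 or +1.
    public static bool isBalanced(Node root)
    {
        if (root == null) return true;
        return Math.Abs(balanceFactor(root)) <= 1
            && isBalanced(root.leftChild)
            && isBalanced(root.rightChild);
    }
}
```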
- AVL Trees (Adelson-Velsky and Landis)
- Red-Black Trees
- B-Trees
- Splay Trees
- 2-3 Trees
A tree rotation is necessary when an insertion or deletion leaves the tree in an unbalanced state. We have four types of rotation:
- Left (LL)
- Right (RR)
- Left-Right (LR)
- Right-Left (RL)
A single rotation is applied when a node is inserted in the right subtree of a right subtree. In the given example, node 1 has a balance factor of 2 after the insertion of node 3. By rotating the tree left, node 2 becomes the root resulting in a balanced tree.
After the left rotation, the tree is balanced.
A single rotation is applied when a node is inserted in the left subtree of a left subtree. In the given example, node 3 has a balance factor of 2 after the insertion of node 1. By rotating the tree right, node 2 becomes the root, resulting in a balanced tree.
After the right rotation, the tree is balanced.
In the given example, node 7 is causing an imbalance resulting in node 10 having a balance factor of 2. As node 7 is inserted in the right subtree of node 5, a left rotation needs to be applied. However, a single rotation will not give us the required results. Now, all we have to do is apply the right rotation as shown before to achieve a balanced tree.
First we apply the left rotation, as in the image below.
Then we rotate to the right, as shown in the next image, and the tree is balanced.
In the given example, node 7 is causing an imbalance resulting in node 5 having a balance factor of 2. As node 7 is inserted in the left subtree of node 10, a right rotation needs to be applied. However, just as before, a single rotation will not give us the required results. Now, by applying the left rotation as shown before, we can achieve a balanced tree.
First we apply the right rotation, as in the image below.
Final rotation
A Heap is a special tree-based data structure in which the tree is a complete binary tree. Heaps have two characteristics:
- The tree must be complete (nodes fill from left to right).
- The value of every node is greater than or equal to its children (in a max heap).
The heap property says that the value of the parent is greater than or equal to (in a max heap) or less than or equal to (in a min heap) the value of its children.
MaxHeap: the root is the largest value in the heap, and every value below a node is less than that node's value.
MinHeap: the root is the smallest value in the heap, and every value below a node is greater than that node's value.
- Sorting (HeapSort)
- Graph algorithms (shortest path)
- Priority queues
- Finding the Kth smallest/largest value
In a heap we can only delete the root node. To fill the empty root we move the last leaf of the tree into it and bubble it down (in a max heap, swapping with the larger child) until the heap property is restored, so removing from a heap takes O(log n).
Another heap operation is finding the maximum value, which is super easy: the root node holds the greatest value in the tree, so it runs in constant time, O(1).
- Heapify → Process to rearrange the heap in order to maintain heap-property.
- Find-max (or Find-min) → find a maximum item of a max-heap, or a minimum item of a min-heap, respectively.
- Insertion → Add a new item in the heap.
- Deletion → Delete an item from the heap.
- Extract Min-Max → Returning and deleting the maximum or minimum element in max-heap and min-heap respectively.
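The insertion (bubble up) and deletion (bubble down) operations described above can be sketched as a small array-backed max heap; the class name and fixed capacity are my own choices:

```csharp
using System;

internal class MaxHeap
{
    private int[] items = new int[100];
    private int count;

    public void insert(int value)
    {
        items[count] = value;
        // Bubble up: swap with the parent while the child is larger.
        int index = count++;
        while (index > 0 && items[index] > items[(index - 1) / 2])
        {
            int parent = (index - 1) / 2;
            (items[index], items[parent]) = (items[parent], items[index]);
            index = parent;
        }
    }

    // O(1): the root is always the maximum.
    public int max()
    {
        if (count == 0) throw new InvalidOperationException();
        return items[0];
    }

    public int remove()
    {
        int root = max();
        // Move the last leaf into the root and bubble it down.
        items[0] = items[--count];
        int index = 0;
        while (true)
        {
            int largest = index;
            int left = 2 * index + 1, right = 2 * index + 2;
            if (left < count && items[left] > items[largest]) largest = left;
            if (right < count && items[right] > items[largest]) largest = right;
            if (largest == index) break;
            (items[index], items[largest]) = (items[largest], items[index]);
            index = largest;
        }
        return root;
    }
}
```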
A Trie (prefix tree) is another type of tree, but not a BST: each node can have more than two children.
We use tries for auto-complete, for example in Google search.
As you can see above, shared prefix characters are not duplicated; each node can have up to 26 children (one per letter of the English alphabet), and the root is empty.
The time complexity of trie operations depends on the length L of the word:
- Lookup: O(L)
- Insert: O(L)
- Delete: O(L)
We can implement tries with an array or a hash map.
Array implementation:
internal class TriesTree
{
    public Node root = new Node(' ');

    public class Node
    {
        public static int ALPHABET_SIZE = 26;
        public char Value { get; set; }
        public Node[] Children = new Node[ALPHABET_SIZE];
        public bool isEndOfWord;

        public Node(char value)
        {
            Value = value;
        }
    }

    public void insert(string word)
    {
        var current = root;
        foreach (var item in word)
        {
            // Assumes lowercase English letters: 'a' maps to index 0.
            var index = item - 'a';
            if (current.Children[index] == null)
                current.Children[index] = new Node(item);
            current = current.Children[index];
        }
        current.isEndOfWord = true;
    }
}
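The insert method above has no matching lookup. Here is a minimal self-contained sketch that adds a contains method walking the same children array (the simplified class is my own, and lowercase-only input is assumed, as in insert):

```csharp
using System;

internal class TriesTreeWithLookup
{
    private class Node
    {
        public Node[] Children = new Node[26];
        public bool isEndOfWord;
    }

    private readonly Node root = new Node();

    // Assumes lowercase English words.
    public void insert(string word)
    {
        var current = root;
        foreach (var ch in word)
        {
            int index = ch - 'a';
            if (current.Children[index] == null)
                current.Children[index] = new Node();
            current = current.Children[index];
        }
        current.isEndOfWord = true;
    }

    public bool contains(string word)
    {
        var current = root;
        foreach (var ch in word)
        {
            int index = ch - 'a';
            if (current.Children[index] == null)
                return false;
            current = current.Children[index];
        }
        // A prefix of an inserted word is not itself a word.
        return current.isEndOfWord;
    }
}
```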
A Graph in Data Structure is a collection of nodes that consist of data and are connected to other nodes of the graph like the below image.
They are also called Vertices. A node can represent anything such as any location, port, houses, buildings, landmarks, etc.
The Degree of a node is the number of edges connecting the node in the graph. A simple example would be, suppose in Facebook, if you have 100 friends then the node that represents you has a degree of 100.
An edge represents a relationship between two nodes in a graph. Edges are also called paths.
The above image represents edges in a graph.
A graph data structure (V, E) consists of:
- A collection of vertices (V) or nodes.
- A collection of edges (E) or path.
In graph theory, a graph that consists of a single cycle is called a cycle graph or circular graph.
- In a cycle graph, the number of vertices equals the number of edges.
- In a cycle graph, every vertex has a degree of two.
A Graph is said to be connected if every pair of vertices in the graph is connected. This means that there is a path between every pair of vertices.
In weighted graphs, each edge has a value associated with it, called its weight. A weighted graph is a simple graph with weighted edges. The weights are usually used to compute the shortest path in the graph.
In directed graphs the edges have a direction from one node towards another. We can only traverse from one node to another if the edge points towards that node.
Directed graphs are used in many areas. One use case is a family tree, where edges can only be directed from parent to child; so a family tree is a directed graph.
Undirected graphs have edges that do not have a direction. Hence, the graph can be traversed in either direction.
The above graph is undirected: the edges do not point in any direction, so each edge can be traversed both ways. In a connected undirected graph, we can reach any node from any other node.
Breadth First Traversal, or Breadth First Search, is an algorithm for visiting all the vertices of a graph or tree data structure.
A standard BFS implementation puts each vertex of the graph into one of two categories:
- Visited
- Not Visited
The purpose of the algorithm is to mark each vertex as visited while avoiding cycles.
The algorithm works as follows:
- Start by putting any one of the graph's vertices at the back of a queue.
- Take the front item of the queue and add it to the visited list.
- Create a list of that vertex's adjacent nodes. Add the ones that aren't in the visited list to the back of the queue.
- Keep repeating steps 2 and 3 until the queue is empty.
The time complexity of the BFS algorithm is O(V + E), where V is the number of vertices and E is the number of edges. The space complexity of the algorithm is O(V).
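The steps above can be sketched with an adjacency list and a queue; the graph representation here, a Dictionary of neighbour lists, is my own choice:

```csharp
using System;
using System.Collections.Generic;

internal class Bfs
{
    // Returns the vertices in the order BFS visits them.
    public static List<int> Traverse(Dictionary<int, List<int>> graph, int start)
    {
        var visited = new HashSet<int>();
        var order = new List<int>();
        var queue = new Queue<int>();

        queue.Enqueue(start);
        visited.Add(start);

        while (queue.Count > 0)
        {
            int vertex = queue.Dequeue();
            order.Add(vertex);
            // Enqueue every neighbour that has not been visited yet.
            foreach (int neighbour in graph[vertex])
            {
                if (visited.Add(neighbour))
                    queue.Enqueue(neighbour);
            }
        }
        return order;
    }
}
```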
Given a weighted graph and a source vertex in the graph, find the shortest paths from the source to all the other vertices in the given graph.
Note: The given graph does not contain any negative edge.
It is easier to start with an example and then think about the algorithm.
- Start with a weighted graph.
- Choose a starting vertex and assign an infinite path length to all other vertices.
- Go to each vertex and update the path lengths of its adjacent vertices.
- If a new path length to an adjacent vertex is shorter than its current one, update it; otherwise leave it.
- Avoid updating the path lengths of already visited vertices.
- After each iteration, pick the unvisited vertex with the smallest path length; so we choose 5 before 7.
- Notice how the rightmost vertex has its path length updated twice.
- Repeat until all the vertices have been visited.
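A compact sketch of the algorithm described above, using the built-in PriorityQueue and an edge list per vertex; the graph encoding with value tuples is my own assumption:

```csharp
using System;
using System.Collections.Generic;

internal class Dijkstra
{
    // graph[v] lists (neighbour, weight) edges; returns the shortest distance from source to every vertex.
    public static Dictionary<int, int> ShortestPaths(
        Dictionary<int, List<(int to, int weight)>> graph, int source)
    {
        var distances = new Dictionary<int, int>();
        foreach (var vertex in graph.Keys)
            distances[vertex] = int.MaxValue; // "infinity"
        distances[source] = 0;

        var queue = new PriorityQueue<int, int>();
        queue.Enqueue(source, 0);

        while (queue.Count > 0)
        {
            queue.TryDequeue(out int vertex, out int distance);
            if (distance > distances[vertex])
                continue; // stale entry: this vertex was already settled with a shorter path
            foreach (var (to, weight) in graph[vertex])
            {
                int candidate = distance + weight;
                // Only update when the new path is shorter.
                if (candidate < distances[to])
                {
                    distances[to] = candidate;
                    queue.Enqueue(to, candidate);
                }
            }
        }
        return distances;
    }
}
```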
The first approach to implementing a graph, the Adjacency Matrix, uses a two-dimensional array, as in the image below:
If two nodes are connected, the matrix cell holds 1, otherwise 0.
The problem with this approach is the amount of space needed: the space complexity is O(V^2), which grows fast.
- Add edge: O(1)
- Remove edge: O(1)
- Query edge: O(1)
- Find neighbours: O(V)
- Add node: O(V^2)
- Remove node: O(V^2)
If you know how many nodes you need and you are not going to add more, the matrix is a great choice, because its edge operations run in O(1).
The second approach, the Adjacency List, is implemented as an array of linked lists: every element of the array holds a linked list of that node's neighbours. It is more space-efficient than the matrix, with a space complexity of O(V + E).
The worst case is when every node is connected to every other node; such a graph is called a dense graph.
K = the number of edges per node.
Average scenario:
Worst-case scenario:
As the images above show, for a dense graph both approaches take the same amount of space and the matrix has better time complexity, so a dense graph is better implemented with a matrix; in practice, however, most graphs are not dense, and the adjacency list is usually the better choice.
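A minimal adjacency-list sketch using a Dictionary of lists; the class and method names are my own, and the array-of-linked-lists variant described above is analogous:

```csharp
using System;
using System.Collections.Generic;

internal class Graph
{
    // Each node maps to the list of nodes it has an edge to.
    private readonly Dictionary<string, List<string>> adjacencyList = new();

    public void AddNode(string label)
    {
        if (!adjacencyList.ContainsKey(label))
            adjacencyList[label] = new List<string>();
    }

    // A directed edge from one node to another.
    public void AddEdge(string from, string to)
    {
        adjacencyList[from].Add(to);
    }

    public List<string> FindNeighbours(string label)
    {
        return adjacencyList[label];
    }
}
```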
In this part of the document you will learn:
Recursion is when a function calls itself.
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program, and in software development, the call stack is what a program uses to keep track of method calls.
When you call a method that itself calls another method, what happens?
The computer pushes a stack frame for the outer method onto the call stack first, then pushes a frame for the inner method on top of it; as each method returns, its frame is popped off the stack.
Each function call has arguments and a return value. Where are these values stored? They are stored in the stack frame.
What does Stack Frame have?
- Local variables
- Arguments passed into the method
- Information about the caller's stack frame
- The return address—what the program should do after the function returns (i.e.: where it should "return to"). This is usually somewhere in the middle of the caller's code.
A Recursive function can be defined as a routine that calls itself directly or indirectly.
In a recursive function, every recursive call adds one Stack Frame that occupies memory until the function finishes, so deep recursion can end up with a huge call stack.
The Recursive function has two parts:
Base Case: The base case is when the function doesn't call itself again, so it doesn't go into an infinite loop.
Recursive Case: The recursive case is when the function calls itself.
public void CountDown(int i)
{
    Console.WriteLine(i);
    // Base Case: stop when the counter reaches zero
    if (i <= 0)
    {
        return;
    }
    // Recursive Case: call ourselves with a smaller value
    else
    {
        CountDown(i - 1);
    }
}
There's no performance benefit to using recursion; in fact, loops are sometimes better for performance. Loops may achieve a performance gain for your program, but Recursion may achieve a performance gain for your programmer.
While a recursive function is running, each call is stored on the Call Stack until the Base Case condition executes; then the calls are popped and completed from the last one to the first.
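As a small illustration (the Factorial example is mine, not from the text), each call below stays on the call stack, holding its own copy of n, until the base case lets the stack unwind:

```csharp
using System;

public class CallStackDemo
{
    public static int Factorial(int n)
    {
        // Base Case: no further call, so the stack stops growing here.
        if (n <= 1)
            return 1;
        // Recursive Case: this frame waits on the stack for the inner result.
        return n * Factorial(n - 1);
    }
}
```

Calling Factorial(5) pushes five frames before any multiplication happens; they are then popped from the last to the first, producing 120.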
The divide-and-conquer technique involves taking a large-scale problem and dividing it into similar sub-problems of a smaller scale and recursively solving each of these sub-problems. Generally, a problem is divided into sub-problems repeatedly until the resulting sub-problems are very easy to solve.
A divide and conquer algorithm is a strategy of solving a large problem by
- breaking the problem into smaller sub-problems
- solving the sub-problems, and
- combining them to get the desired output.
Here are the steps involved:
- Divide: Divide the given problem into sub-problems using recursion.
- Conquer: Solve the smaller sub-problems recursively. If the subproblem is small enough, then solve it directly.
- Combine: Combine the solutions of the sub-problems that are part of the recursive process to solve the actual problem.
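The three steps can be sketched on a toy problem, finding the maximum of an array (an example of mine, not from the original):

```csharp
using System;

public class DivideAndConquerDemo
{
    // Divide: split the range in half.
    // Conquer: find the max of each half recursively.
    // Combine: the overall max is the larger of the two.
    public static int Max(int[] array, int low, int high)
    {
        if (low == high)              // small enough: solve directly
            return array[low];
        int middle = (low + high) / 2;
        int leftMax = Max(array, low, middle);
        int rightMax = Max(array, middle + 1, high);
        return Math.Max(leftMax, rightMax);
    }
}
```

Calling Max(array, 0, array.Length - 1) divides the range, conquers each half recursively, and combines the two partial answers into the final one.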
In the Sorting algorithms section, we learn:
- How they work
- Time & space complexity
- Implementation
This is the simplest type of sorting algorithm:
We scan the array items from left to right and, if two adjacent items are out of order, swap them.
In the below image, we iterate over the array 5 times, equal to the length of the array, and in each pass we compare items 0 & 1, 1 & 2, 2 & 3, and 3 & 4.
As you can see, a fully unsorted array produces the worst-case scenario, and as the below image shows, this makes the algorithm inefficient (the red area).
The below code contains two optimizations. First, we add an isSorted flag: if a full pass makes no swaps, the array is already sorted and we can stop without iterating over the remaining passes.
Second, each pass moves one more item into its final position at the end of the array, so there is no need to visit it again and the inner loop can shrink by one each time.
public class BubbleSort
{
    public void sort(int[] array)
    {
        for (int i = 0; i < array.Length; i++)
        {
            bool isSorted = true;
            // each pass bubbles the largest remaining item to the end,
            // so the inner loop can shrink by i each time
            for (int j = 1; j < array.Length - i; j++)
            {
                if (array[j] < array[j - 1])
                {
                    swap(array, j, j - 1);
                    isSorted = false;
                }
            }
            // no swaps in a full pass means the array is already sorted
            if (isSorted)
                return;
        }
    }
private void swap(int[] array, int index1, int index2)
{
var temp = array[index1];
array[index1] = array[index2];
array[index2] = temp;
}
}
In this sort, we find the minimum item in the array (here, 1 at index 3) and swap it with the item at index zero; now 1 is in its correct position.
Our array then has two parts, sorted and unsorted, like below.
We repeat this action on the unsorted part until all array items are sorted.
As we saw, placing one item requires a simple (linear) search for the lowest value, and this search must be repeated for each item, so the total cost is O(n*n) = O(n^2).
This is a fairly slow algorithm.
Selection sort implementation
public class SelectionSort
{
    public void sort(int[] array)
    {
        for (int j = 0; j < array.Length; j++)
        {
            // find the index of the smallest item in the unsorted part
            int minIndex = j;
            for (int index = minIndex; index < array.Length; index++)
                if (array[minIndex] > array[index])
                    minIndex = index;
            swap(array, minIndex, j);
        }
    }
    private void swap(int[] array, int index1, int index2)
    {
        var temp = array[index1];
        array[index1] = array[index2];
        array[index2] = temp;
    }
}
The best way to understand insertion sort is a card game: imagine each array item is a card, and every time you get a card you insert it into its right position. First you get 8; then you get 2 and insert it before 8; then you get 4 and insert it between 2 and 8; then 1, which goes before all the items; and finally 3, which is inserted after 2.
Insertion sort implementation
public class InsertionSort
{
    public void sort(int[] array)
    {
        for (int i = 1; i < array.Length; i++)
        {
            var current = array[i];
            var j = i - 1;
            // shift larger sorted items right to open a slot for current
            while (j >= 0 && array[j] > current)
            {
                array[j + 1] = array[j];
                j--;
            }
            array[j + 1] = current;
        }
    }
}
The idea of merge sort is to break a list down into smaller and smaller sub-lists, sort them, and merge them back together into a completely sorted list. We divide the array until we cannot divide any more, then compare the single-item arrays to sort them and merge them again.
Merge Sort is a Divide and Conquer algorithm: it works by recursively dividing a problem into smaller sub-problems until they are easy to solve. If our array is like below:
We find the index of the middle by dividing the length of the array by two.
middle = length/2 => 5/2 = 2
First, we focus on the left sub-array: find its middle and split it.
Now we have single-item arrays, and we merge and combine the two left arrays.
We then repeat the process on the right sub-array, splitting and merging until the array is fully sorted.
Real Example with numbers:
This algorithm is more efficient than the previous ones, but it costs extra space for the temporary left and right arrays.
Merge Sort Implementation:
public class MergeSort
{
    public void sort(int[] array)
    {
        if (array.Length < 2)
            return;
        // divide the array in half
        var middle = array.Length / 2;
        int[] left = new int[middle];
        for (int i = 0; i < middle; i++)
            left[i] = array[i];
        int[] right = new int[array.Length - middle];
        for (int i = middle; i < array.Length; i++)
            right[i - middle] = array[i];
        // sort each half
        sort(left);
        sort(right);
        // merge the result
        merge(left, right, array);
    }
    private void merge(int[] left, int[] right, int[] result)
    {
        int i = 0, j = 0, k = 0;
        while (i < left.Length && j < right.Length)
        {
            if (left[i] <= right[j])
                result[k++] = left[i++];
            else
                result[k++] = right[j++];
        }
        // copy whatever remains of either half
        while (i < left.Length)
            result[k++] = left[i++];
        while (j < right.Length)
            result[k++] = right[j++];
    }
}
Like Merge Sort, quicksort uses divide-and-conquer and so it's a recursive algorithm. The way that Quicksort uses divide-and-conquer is a little different from how merge sort does.
Quicksort has a couple of other differences from merge sort. Quicksort works in place, and its worst-case running time is as bad as selection sort's and insertion sort's, but its average-case running time is as good as merge sort's.
So why think about quicksort when merge sort is at least as good? That's because the constant factor hidden in the big-Θ notation for quicksort is quite good.
Here are the divide, conquer, and combine steps that Quicksort uses:
- Pick a pivot element.
- Partition, or rearrange, the array into two subarrays, such that all elements less than the pivot come before it and all elements greater than or equal to the pivot come after it.
- Sort the subarrays recursively with quicksort.
Picking a good pivot is the key to a fast implementation of quicksort; the best-case pivot divides the array into two equal parts.
- The average-case time complexity to sort an array of n elements is O(n lg n).
- Generally, it runs very fast. It is even faster than merge sort.
- No extra storage is required
- Its running time can be different for different array contents.
- The worst case for quicksort takes place when the array is already sorted (with a naive pivot choice, such as always picking the first or last element).
- It is not stable.
Case | Time Complexity |
---|---|
Best Case | O(n log n) |
Average Case | O(n log n) |
Worst Case | O(n^2) |
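The section describes quicksort without code, so here is a minimal in-place sketch. It uses the last element as the pivot (the Lomuto partition scheme, a common teaching choice; the original may have intended a different pivot rule):

```csharp
using System;

public class QuickSort
{
    public void sort(int[] array) => sort(array, 0, array.Length - 1);

    private void sort(int[] array, int low, int high)
    {
        if (low >= high)
            return;
        // Partition, then sort the two sides recursively (in place).
        int pivotIndex = partition(array, low, high);
        sort(array, low, pivotIndex - 1);
        sort(array, pivotIndex + 1, high);
    }

    // Lomuto partition: items smaller than the pivot move left of
    // the pivot's final slot, and that slot's index is returned.
    private int partition(int[] array, int low, int high)
    {
        int pivot = array[high];
        int boundary = low - 1;
        for (int i = low; i < high; i++)
            if (array[i] < pivot)
                swap(array, ++boundary, i);
        swap(array, boundary + 1, high);
        return boundary + 1;
    }

    private void swap(int[] array, int index1, int index2)
    {
        var temp = array[index1];
        array[index1] = array[index2];
        array[index2] = temp;
    }
}
```

Note that sorting happens in the input array itself: no temporary arrays are allocated, unlike merge sort.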
Binary Search is a searching algorithm for finding an element's position in a sorted Array. Binary search can be implemented only on a sorted list of items. If the elements are not sorted already, we need to sort them first.
1. This is the array in which we want to search for X = 4.
2. Set two pointers, low and high, at the lowest and the highest positions respectively.
3. Find the index of the middle element of the array, i.e. `arr[(low + high)/2] = 3`.
4. If `x == mid`, return mid. Otherwise, compare the element being searched for with mid.
5. If `x > mid`, compare x with the middle element of the elements on the right side of mid. This is done by setting `low = mid + 1`.
6. Otherwise, compare x with the middle element of the elements on the left side of mid. This is done by setting `high = mid - 1`.
7. Repeat steps 3 to 6 until low meets high, at which point X has been found.
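The steps above can be sketched as follows (the class and method names are mine):

```csharp
using System;

public class BinarySearchDemo
{
    // Returns the index of x in a sorted array, or -1 if not found.
    public static int Search(int[] array, int x)
    {
        int low = 0, high = array.Length - 1;
        while (low <= high)
        {
            int mid = (low + high) / 2;
            if (array[mid] == x)
                return mid;          // found
            if (x > array[mid])
                low = mid + 1;       // search the right half
            else
                high = mid - 1;      // search the left half
        }
        return -1;                   // low passed high: x is not present
    }
}
```

Each iteration halves the remaining range, which is where the O(log n) running time comes from.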
- Binary search is a lot faster than simple search.
- O(log n) is faster than O(n), and its advantage grows as the list of items you're searching through grows.
- Algorithm times are measured in terms of the growth of an algorithm.
Case | Time Complexity | Space Complexity |
---|---|---|
Best Case | O(1) | O(1) |
Average Case | O(log n) | O(1) |
Worst Case | O(log n) | O(1) |
A greedy algorithm is an approach for solving a problem by selecting the best option available at the moment. It doesn't worry about whether the current best result will bring the overall optimal result.
This algorithm may not produce the best result for all problems, because it always makes the locally best choice in the hope of producing the globally best result.
- The algorithm is easier to describe.
- This algorithm can perform better than other algorithms (but, not in all cases).
As mentioned earlier, the greedy algorithm doesn't always produce the optimal solution; this is its major disadvantage.
For example, suppose we want to find the longest path in the graph below from root to leaf. Let's use the greedy algorithm here.
- Let's start with the root node, 20. The weight of the right child is 3 and the weight of the left child is 2.
- Our problem is to find the largest path, and the best option at the moment is 3, so the greedy algorithm chooses 3.
- Finally, the weight of the only child of 3 is 1. This gives us our final result: 20 + 3 + 1 = 24.

However, this is not the optimal solution: there is another path that carries more weight (20 + 2 + 10 = 32).
Problem: You have to make change for an amount using the smallest possible number of coins.
Amount: $18
Available coins are
$5 coin
$2 coin
$1 coin
There is no limit to the number of each coin you can use.
- Create an empty `solution-set = { }`. Available coins are `{5, 2, 1}`.
- We are supposed to reach `sum = 18`. Let's start with `sum = 0`.
- Always select the coin with the largest value (i.e. 5) as long as the sum does not exceed 18. (By selecting the largest value at each step, we hope to reach the target faster. This concept is called the greedy choice property.)
- In the first iteration, `solution-set = {5}` and `sum = 5`.
- In the second iteration, `solution-set = {5, 5}` and `sum = 10`.
- In the third iteration, `solution-set = {5, 5, 5}` and `sum = 15`.
- In the fourth iteration, `solution-set = {5, 5, 5, 2}` and `sum = 17`. (We cannot select 5 here because that would make `sum = 20`, which is greater than 18, so we select the next largest coin, which is 2.)
- Similarly, in the fifth iteration, select 1. Now `sum = 18` and `solution-set = {5, 5, 5, 2, 1}`.
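The iterations above can be sketched in code; the amount and coin values come from the example, while the class and method names are mine:

```csharp
using System;
using System.Collections.Generic;

public class CoinChangeDemo
{
    // Greedy choice: always take the largest coin that still fits.
    public static List<int> MakeChange(int amount, int[] coins)
    {
        Array.Sort(coins);
        Array.Reverse(coins);            // largest coin first
        var solution = new List<int>();
        int sum = 0;
        foreach (int coin in coins)
        {
            while (sum + coin <= amount) // move on once this coin would overshoot
            {
                solution.Add(coin);
                sum += coin;
            }
        }
        return solution;
    }
}
```

For amount 18 and coins {5, 2, 1} this returns {5, 5, 5, 2, 1}, matching the trace above.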