Time complexity analysis
Amortized analysis
Amortized analysis is used for algorithms where an occasional operation is very slow, but most other operations are fast.
Example data structures whose operations are analyzed using amortized analysis are hash tables, disjoint sets, and splay trees.
Worst-case analysis of a single operation gives a loose upper bound; amortized analysis gives a tighter one. In amortized analysis, we analyze a sequence of operations and guarantee a worst-case average time per operation that is lower than the worst-case time of a single expensive operation.
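The classic example is a dynamic array that doubles its capacity when full. The sketch below (a toy class, not a real library API) counts how many element copies a sequence of appends triggers, showing that the total stays proportional to the number of appends, i.e. O(1) amortized per append:

```python
# Toy dynamic array that doubles its capacity when full, to illustrate
# amortized analysis. The class and field names are illustrative only.

class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.copies = 0  # total elements moved during resizes

    def append(self, item):
        if self.size == self.capacity:
            self.capacity *= 2        # occasional expensive resize...
            self.copies += self.size  # ...moves every existing element
        self.size += 1

arr = DynamicArray()
for i in range(1024):
    arr.append(i)

# n appends trigger fewer than n element copies in total
# (1 + 2 + 4 + ... + n/2 = n - 1), so the amortized cost per append is O(1).
print(arr.copies, "copies for", arr.size, "appends")
```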
List operations
Inner working: Python lists are internally represented as arrays, similar to the implementation of vectors in C++ or ArrayList in Java. The costly operations are inserting and deleting items near the beginning (everything after the insertion point has to be shifted). Appending at the end is also occasionally costly, when the preallocated space fills up and the array must be resized.
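CPython's over-allocation can be observed directly with `sys.getsizeof`: the byte size of a list grows in occasional jumps rather than on every append, which is why most appends cost O(1):

```python
import sys

# Record the list's byte size after each append. In CPython the size
# grows in chunks (over-allocation), not on every append.
sizes = []
lst = []
for _ in range(32):
    lst.append(None)
    sizes.append(sys.getsizeof(lst))

# Far fewer distinct sizes than appends => capacity grows in chunks.
print(len(set(sizes)), "growth steps for", len(sizes), "appends")
```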
Append
l.append(item)
O(1)
Clear
l.clear()
O(1)
Containment
item in/not in l
O(N)
Copy
l.copy()
O(N)
Delete
del l[i]
O(N)
Extend
l.extend(…)
O(N)
Equality
l1==l2, l1!=l2
O(N)
Index
l[i]
O(1)
Iteration
for item in l:
O(N)
Length
len(l)
O(1)
Multiply
k*l
O(k*N)
Min, Max
min(l), max(l)
O(N)
Pop from end
l.pop(-1)
O(1)
Pop intermediate
l.pop(i)
O(N)
Reverse
l.reverse()
O(N)
Slice
l[x:y]
O(y-x)
Sort
l.sort()
O(N*log(N))
Store
l[i]=item
O(1)
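A few of the table's rows, exercised on a small list for concreteness:

```python
l = [3, 1, 4, 1, 5]

l.append(9)          # O(1) amortized
assert l[-1] == 9    # indexing: O(1)
assert 4 in l        # membership: O(N) linear scan
assert len(l) == 6   # O(1): the length is stored, not counted

middle = l[1:4]      # slicing copies y - x items
assert middle == [1, 4, 1]

l.sort()             # Timsort: O(N*log(N))
assert l == [1, 1, 3, 4, 5, 9]

del l[0]             # O(N): the remaining items shift left
assert l == [1, 3, 4, 5, 9]
```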
Deque
Deque is implemented internally as a doubly linked list (of fixed-size blocks). If you need to add or remove items at both ends, consider using a collections.deque instead of a list.
Copy
l.copy()
O(N)
Append
l.append(item)
O(1)
Append left
l.appendleft(item)
O(1)
Pop
l.pop()
O(1)
Pop left
l.popleft()
O(1)
Extend
l.extend(…)
O(K)
Extend left
l.extendleft(…)
O(K)
Remove
l.remove(item)
O(N)
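The deque operations above in action, contrasted with their list equivalents:

```python
from collections import deque

# deque gives O(1) appends and pops at both ends, unlike list,
# where operations near the front cost O(N).
d = deque([2, 3, 4])

d.appendleft(1)   # O(1); l.insert(0, 1) on a list would be O(N)
d.append(5)       # O(1)
assert list(d) == [1, 2, 3, 4, 5]

assert d.popleft() == 1   # O(1); l.pop(0) on a list is O(N)
assert d.pop() == 5       # O(1)

d.extendleft([0, -1])     # O(K); note the items arrive reversed
assert list(d) == [-1, 0, 2, 3, 4]
```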
Dictionary operations
Clear
d.clear()
O(1)
Construction
dict(…)
O(len(d))
Delete
del d[k]
O(1)
Get
d.get(k)
O(1)
Iteration (keys, values, items)
for item in d:
O(N)
Length
len(d)
O(1)
Pop
d.pop(k)
O(1)
Pop Item
d.popitem()
O(1)
Returning Views
d.keys()
O(1)
Fromkeys
d.fromkeys(seq)
O(len(seq))
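A short walk through the dictionary operations above; note in particular that the views returned by d.keys() are live and reflect later changes:

```python
# A few dictionary operations from the table above.
d = dict.fromkeys(["a", "b", "c"], 0)   # O(len(seq))
assert d == {"a": 0, "b": 0, "c": 0}

d["b"] = 2                 # store: O(1) on average
assert d.get("b") == 2     # O(1) on average
assert d.get("z") is None  # missing key: the default is returned

assert d.pop("a") == 0     # O(1) on average
assert len(d) == 2         # O(1)

keys = d.keys()            # view object: O(1), reflects later changes
d["d"] = 4
assert "d" in keys
```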
Set operations
Inner working: a set in Python is a collection of unique items, used mainly for membership testing and eliminating duplicate entries. The underlying data structure is a hash table, which supports insertion, deletion, and membership testing in O(1) on average. Sets in Python are unordered collections with duplicate elements removed.
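The two main uses named above, de-duplication and membership testing, in a few lines:

```python
# Eliminating duplicates and testing membership, both backed by hashing.
words = ["red", "blue", "red", "green", "blue"]

unique = set(words)               # duplicates eliminated: O(len(words))
assert unique == {"red", "blue", "green"}

assert "green" in unique          # O(1) on average, via hashing
assert "purple" not in unique
```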
Add
s.add(item)
O(1)
Clear
s.clear()
O(1)
Copy
s.copy()
O(N)
Containment
item in/not in s
O(1)
Creation
set(…)
O(len(s))
Discard
s.discard(item)
O(1)
Difference
s1-s2
O(len(s1))
Difference Update
s1.difference_update(s2)
O(len(s2))
Equality
s1==s2, s1!=s2
O(min(len(s1), len(s2)))
Intersection
s1 & s2
O(min(len(s1), len(s2)))
Iteration
for item in s:
O(N)
Is Subset
s1<=s2
O(len(s1))
Is Superset
s1>=s2
O(len(s2))
Pop
s.pop()
O(1)
Union
s1 | s2
O(len(s1) + len(s2))
Symmetric Difference
s1 ^ s2
O(len(s1))
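The set-algebra operators from the table, checked on two small sets:

```python
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5}

assert s1 & s2 == {3, 4}           # intersection: O(min(len(s1), len(s2)))
assert s1 | s2 == {1, 2, 3, 4, 5}  # union: O(len(s1) + len(s2))
assert s1 - s2 == {1, 2}           # difference: O(len(s1))
assert s1 ^ s2 == {1, 2, 5}        # symmetric difference
assert {3, 4} <= s1                # subset test
assert 3 in s1                     # membership: O(1) on average via hashing
```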
Sorted container inner working
Inner implementation: http://www.grantjenks.com/docs/sortedcontainers/implementation.html
First impl intuition:
The first is that Python’s list is fast, really fast. Lists have great characteristics for memory management and random access.
The second is that bisect.insort is fast. This is somewhat counter-intuitive since it involves shifting a series of items in a list.
Second impl intuition: But using only one list and bisect.insort would produce sluggish behavior for lengths exceeding ten thousand. So the implementation of Sorted List uses a list of lists to store elements. In this way, inserting or deleting is most often performed on a short list. Only rarely does a new list need to be added or deleted.
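A minimal sketch of this list-of-lists idea (not the real sortedcontainers implementation; the LOAD constant, the split rule, and the helper name are simplifying assumptions): each insert bisects into one short sublist, so only a few items shift, and a sublist is split only when it grows too large.

```python
import bisect

# Simplified list-of-lists sorted container. Real sortedcontainers keeps
# a separate index of sublist maxima; here we rebuild it on each insert.
LOAD = 4  # assumed load factor, for illustration only

def insert(lists, value):
    if not lists:
        lists.append([value])
        return
    # Find the sublist whose range should contain the value.
    i = bisect.bisect_left([sub[-1] for sub in lists], value)
    i = min(i, len(lists) - 1)
    bisect.insort(lists[i], value)   # shifts items only within a short list
    if len(lists[i]) > 2 * LOAD:     # rarely: split an oversized sublist
        half = len(lists[i]) // 2
        lists.insert(i + 1, lists[i][half:])
        lists[i] = lists[i][:half]

lists = []
for v in [5, 1, 9, 3, 7, 2, 8, 4, 6, 0]:
    insert(lists, v)

flat = [v for sub in lists for v in sub]
assert flat == sorted(flat)
```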
Compare with tree based impl
Traditional tree-based designs have better big-O notation but that ignores the realities of today’s software and hardware. For a more in-depth analysis, read Performance at Scale. Compared to tree-based implementations, using lists of lists has a few advantages based on memory usage.
Most insertion/deletion doesn’t require allocating or freeing memory. This can be a big win as it takes a lot of strain off the garbage collector and memory system.
Pointers to elements are packed densely. A traditional tree-based implementation would require two pointers (left/right) to child nodes. Lists have no such overhead. This benefits the hardware’s memory architecture and more efficiently utilizes caches.
The memory overhead per item is effectively a pointer to the item. Binary tree implementations must add at least two more pointers per item.
Iteration is extremely fast as sequentially indexing lists is a strength of modern processors.
SortedList operations
http://www.grantjenks.com/docs/sortedcontainers/sortedlist.html
SortedSet operations
http://www.grantjenks.com/docs/sortedcontainers/sortedset.html
SortedDict operations
http://www.grantjenks.com/docs/sortedcontainers/sorteddict.html