Lecture No.01
Data Structures
Muhammad Rehman
What is a Computer Program?
• To know exactly what a data structure is, we must first know:
– What is a computer program?
Input → some mysterious processing → Output
Data Structures
• Prepares students for the more advanced material they will encounter in later courses.
• Covers well-known data structures such as dynamic arrays, linked lists, stacks, queues, trees and graphs.
• Implements data structures in C++
Example
• Data structures for storing student data:
– Arrays
– Linked Lists
• Issues
– Space needed
– Operations efficiency (Time required to complete
operations)
• Retrieval
• Insertion
• Deletion
– Frequency of usage of above operations
What data structure to use?
Data structures let the input and output be represented in a way that
can be handled efficiently and effectively.
Examples: array, linked list, tree, queue, stack
Organizing Data
• Any organization for a collection of records that can be searched, processed in any order, or modified.
• The choice of data structure and algorithm can make the difference between a program running in a few seconds or many days.
What’s the difference
• Different types of values
• Different structures
– No structure – just a collection of values
– Linear structure of values – the order matters
– Set of key-value pairs
– Hierarchical structures
– Grid/table
– ….
• Different access disciplines
– get, put, remove anywhere
– get, put, remove only at the ends, or only at the top, or …
– get, put, remove by position, or by value, or by key, or …
– ….
Good Algorithms?
• Run in less time
• Consume less memory
But time (time complexity) is usually the more important computational resource
Complexity
• In examining algorithm efficiency we must
understand the idea of complexity
• Complexity is the consumption of resources.
• The most important aspects of complexity are
– Space complexity
– Time Complexity
Space Complexity
• When memory was expensive we focused on making
programs as space efficient as possible and developed
schemes to make memory appear larger than it really
was (virtual memory and memory paging schemes)
• Space complexity is still important in the field of
embedded computing (hand held computer based
equipment like cell phones, palm devices, etc)
Time Complexity
• Is the algorithm “fast enough” for my needs
• How much longer will the algorithm take if I
increase the amount of data it must process
• Given a set of algorithms that accomplish the
same thing, which is the right one to choose
Running Time of an Algorithm
• Depends upon
• Input Size
• Nature of Input
• Generally time grows with size of input, so
running time of an algorithm is usually
measured as function of input size.
• Running time is measured in terms of number
of steps/primitive operations performed
• Independent from machine, OS
Finding running time of an
Algorithm / Analyzing an Algorithm
• Running time is measured by number of
steps/primitive operations performed
• A step means an elementary operation such as
-, +, *, <, =, A[i], etc.
• We will measure the number of steps taken in terms of the size of the input
Simple Example
// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N)
{
int s=0;
for (int i=0; i< N; i++)
s = s + A[i];
return s;
}
How should we analyse this?
Simple Example
// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N){
int s=0;
for (int i=0; i< N; i++)
s = s + A[i];
return s;
}
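Step labels on the code: 1 marks int s=0; 2, 3 and 4 mark the three parts of the for header (i=0, i<N, i++); 5, 6 and 7 mark the operations in the loop body; 8 marks return s.
1, 2, 8: executed once
3, 4, 5, 6, 7: executed once per iteration of the for loop, N iterations
Total: 5N + 3
The complexity function of the algorithm is: f(N) = 5N + 3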
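The 5N + 3 tally can be checked mechanically. The following is a minimal sketch, assuming the slide's counting convention in which the final failing loop test is not counted. The function name countSumSteps and the steps counter are illustrative and not part of the lecture.

```cpp
#include <vector>

// Re-runs the Sum loop while counting primitive steps the way the slide does.
// Assumption: the last (failing) evaluation of i < N is ignored, as in the 5N + 3 tally.
long long countSumSteps(const std::vector<int>& A) {
    long long steps = 0;
    const int N = static_cast<int>(A.size());
    int s = 0;  steps += 1;          // initialise s (once)
    int i = 0;  steps += 1;          // initialise i (once)
    while (i < N) {
        steps += 1;                  // loop test i < N
        s = s + A[i];  steps += 3;   // array access, addition, assignment
        i++;           steps += 1;   // increment i
    }
    steps += 1;                      // return s (once)
    (void)s;                         // s is computed only to mirror the original loop
    return steps;
}
```

Calling countSumSteps on an array of 10 elements yields 53 steps under this convention, matching f(10) = 5*10 + 3.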
Simple Example /Growth of 5n+3
Estimated running time for different values of N:
N = 10 => 53 steps
N = 100 => 503 steps
N = 1,000 => 5003 steps
N = 1,000,000 => 5,000,003 steps
As N grows, the number of steps grows in linear
proportion to N for this function “Sum”
What Dominates in Previous
Example?
What about the +3 and the 5 in 5N+3?
– As N gets large, the +3 becomes insignificant
– The 5 is inaccurate, since different operations take varying amounts of time, and it has no real significance either
What is fundamental is that the time is linear in N.
Asymptotic Complexity: As N gets large, concentrate on
the highest order term:
• Drop lower order terms such as +3
• Drop the constant coefficient of the highest order term,
which leaves N
Comparing Functions: Asymptotic
Notation
• Big Oh Notation: Upper bound
• Omega Notation: Lower bound
• Theta Notation: Tight bound
BIG OMEGA NOTATION
• If we want to say “running time is at least…” we use Ω
• Big Omega notation, Ω, is used to express lower bounds on a function.
• If f(n) and g(n) are two complexity functions then we can say:
f(n) is Ω(g(n)) if there exist positive numbers c and n0
such that 0 <= c*g(n) <= f(n) for all n >= n0
Example: 5n+3 = Ω(n)
BIG THETA NOTATION
• If we wish to express tight bounds we use the theta notation, Θ
• f(n) = Θ(g(n)) means that f(n) = O(g(n)) and f(n) = Ω(g(n)), i.e. there are positive constants c1, c2 and n0 such that c2*g(n) <= f(n) <= c1*g(n) for all n >= n0
WHAT DOES THIS ALL MEAN?
• If f(n) = Θ(g(n)) we say that f(n) and g(n)
grow at the same rate, asymptotically
• If f(n) = O(g(n)) and f(n) ≠ Ω(g(n)), then we
say that f(n) is asymptotically slower
growing than g(n).
• If f(n) = Ω(g(n)) and f(n) ≠ O(g(n)), then we
say that f(n) is asymptotically faster growing
than g(n).
WHICH NOTATION DO WE USE?
• To express the efficiency of our algorithms
which of the three notations should we use?
• As computer scientists we generally like to
express the running time of our algorithms in big O, since we would
like to know their upper bounds.
• If we know the worst case then we can aim to
improve it and/or avoid it.
Big Oh Notation
If f(N) and g(N) are two complexity functions, we say
f(N) = O(g(N))
(read "f(N) is order g(N)", or "f(N) is big-O of g(N)")
if there are positive constants c and N0 such that
f(N) ≤ c * g(N)
for all N > N0, i.e. for all sufficiently large N.
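As a worked instance, f(N) = 5N + 3 from the Sum example is O(N): with c = 6 and N0 = 2, 5N + 3 ≤ 6N holds whenever N ≥ 3. The sketch below tests a proposed pair (c, N0) over a finite range; the function name boundHolds and its limit parameter are illustrative assumptions, and a finite check can only refute a pair, never prove the bound for all N.

```cpp
// Checks f(N) <= c * g(N) for every N in (N0, limit],
// with f(N) = 5N + 3 and g(N) = N hard-coded for this example.
bool boundHolds(long long c, long long N0, long long limit) {
    for (long long N = N0 + 1; N <= limit; ++N) {
        long long f = 5 * N + 3;   // f(N) = 5N + 3 from the Sum example
        long long g = N;           // g(N) = N
        if (f > c * g) return false;
    }
    return true;
}
```

With c = 5 the check fails for every N, because 5N + 3 is always larger than 5N; the constant must leave room for the lower order term.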
Big Oh Notation
• O(f(n)) =
{g(n) : there exist positive constants c and n0
such that 0 <= g(n) <= c f(n) for all n >= n0}
• O(f(n)) is a set of functions.
• n = O(n^2) means that the function n belongs to
the set of functions O(n^2)
Big-Oh Notation
• Even though it is correct to say “7n - 3 is O(n^3)”, a better statement is “7n - 3 is O(n)”, that
is, one should make the approximation as tight
as possible
• Simple Rule:
Drop lower order terms and constant factors
7n - 3 is O(n)
8n^2 log n + 5n^2 + n is O(n^2 log n)
Some Questions
3n^2 - 100n + 6 = O(n^2)?
3n^2 - 100n + 6 = O(n^3)?
3n^2 - 100n + 6 = O(n)?
3n^2 - 100n + 6 = Ω(n^2)?
3n^2 - 100n + 6 = Ω(n^3)?
3n^2 - 100n + 6 = Ω(n)?
3n^2 - 100n + 6 = Θ(n^2)?
3n^2 - 100n + 6 = Θ(n^3)?
3n^2 - 100n + 6 = Θ(n)?
Performance Classification
f(n): Classification
1: Constant. Run time is fixed and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed.
log n: Logarithmic. When n increases, so does run time, but much more slowly. Common in programs which solve large problems by transforming them into smaller problems.
n: Linear. Run time varies directly with n. Typically, a small amount of processing is done on each element.
n log n: When n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solve them independently, then combine the solutions.
n^2: Quadratic. When n doubles, run time increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a double nested loop).
n^3: Cubic. When n doubles, run time increases eightfold.
2^n: Exponential. When n doubles, run time squares. This is often the result of a natural, “brute force” solution.
Size does matter
What happens if we double the input size N?
N     log2 N   5N     N log2 N   N^2     2^N
8     3        40     24         64      256
16    4        80     64         256     65536
32    5        160    160        1024    ~10^9
64    6        320    384        4096    ~10^19
128   7        640    896        16384   ~10^38
256   8        1280   2048       65536   ~10^77
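The rows of the doubling table can be regenerated with a few lines of code. This is a small sketch, assuming N is always a power of two; the helper names log2Int, nLogN and printGrowthTable are illustrative and do not come from the lecture.

```cpp
#include <cmath>
#include <cstdio>

// Integer base-2 logarithm; exact when N is a power of two.
int log2Int(int N) {
    int lg = 0;
    while (N > 1) { N /= 2; ++lg; }
    return lg;
}

// N * log2(N) as an integer.
long long nLogN(int N) {
    return static_cast<long long>(N) * log2Int(N);
}

// Prints N, log2 N, 5N, N log2 N, N^2 and 2^N for N = 8, 16, ..., 256.
void printGrowthTable() {
    std::printf("N     log2N  5N     NlogN    N^2     2^N\n");
    for (int N = 8; N <= 256; N *= 2) {
        std::printf("%-5d %-6d %-6d %-8lld %-7d %.2e\n",
                    N, log2Int(N), 5 * N, nLogN(N), N * N,
                    std::pow(2.0, N));   // 2^N as a double, since it overflows integers quickly
    }
}
```

The 2^N column is printed in scientific notation because 2^64 already exceeds the range of a signed 64-bit integer.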
COMPLEXITY CLASSES
[Figure: growth curves of the complexity classes; vertical axis: time (steps)]
Size does matter
• Suppose a program has run time O(n!) and the run
time for
n = 10 is 1 second
For n = 12, the run time is 2 minutes
For n = 14, the run time is 6 hours
For n = 16, the run time is 2 months
For n = 18, the run time is 50 years
For n = 20, the run time is 200 centuries
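The figures for the n! example follow from scaling the 1-second baseline by n!/10!. The sketch below assumes the run time is exactly proportional to n! and that n is at least 10; the function name factorialRunTimeSeconds is illustrative.

```cpp
// Estimated run time in seconds for input size n (n >= 10 assumed),
// taking the time to be exactly proportional to n! with n = 10 costing 1 second.
double factorialRunTimeSeconds(int n) {
    double seconds = 1.0;
    for (int k = 11; k <= n; ++k)
        seconds *= k;            // builds up n! / 10!
    return seconds;
}
```

For n = 12 this gives 11 * 12 = 132 seconds, a little over 2 minutes, and for n = 14 it gives 24,024 seconds, roughly 6.7 hours, in line with the rounded values on the slide.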
Standard Analysis Techniques
• Constant time statements
• Analyzing Loops
• Analyzing Nested Loops
• Analyzing Sequence of Statements
• Analyzing Conditional Statements
Constant time statements
• Simplest case: O(1) time statements
• Assignment statements of simple data types
int x = y;
• Arithmetic operations:
x = 5 * y + 4 - z;
• Array referencing:
A[j] = 5;
• Array assignment:
for any index j, A[j] = 5;
• Most conditional tests:
if (x < 12) ...
Analyzing Loops
• Any loop has two parts:
– How many iterations are performed?
– How many steps per iteration?
int sum = 0,j;
for (j=0; j < N; j++)
sum = sum +j;
– Loop executes N times (0..N-1)
– 4 = O(1) steps per iteration
• Total time is N * O(1) = O(N*1) = O(N)
ANALYZING LOOPS – LINEAR LOOPS
• Example (have a look at this code segment):
• Efficiency is proportional to the number of iterations.
• Efficiency time function is :
f(n) = 1 + (n-1) + c*(n-1) + (n-1)
= (c+2)*(n-1) + 1
= (c+2)n - (c+2) + 1
• Asymptotically, efficiency is : O(n)
Analyzing Loops
• What about this for loop?
int sum =0, j;
for (j=0; j < 100; j++)
sum = sum +j;
• Loop executes 100 times
• 4 = O(1) steps per iteration
• Total time is 100 * O(1) = O(100 * 1) = O(100)
= O(1)
Analyzing Nested Loops
• Treat just like a single loop and evaluate each level of
nesting as needed:
int j,k;
for (j=0; j<N; j++)
for (k=N; k>0; k--)
sum += k+j;
• Start with outer loop:
– How many iterations? N
– How much time per iteration? Need to evaluate inner loop
• Inner loop uses O(N) time
• Total time is N * O(N) = O(N*N) = O(N^2)
HOW DID WE GET THIS ANSWER?
• When doing Big-O analysis, we sometimes have
to compute a series like: 1 + 2 + 3 + ... + (n-1) + n
• i.e. the sum of the first n numbers. What is the
complexity of this?
• Gauss figured out that the sum of the first n
numbers is always n(n+1)/2, which is O(n^2)
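The series 1 + 2 + ... + n has the closed form n(n+1)/2, and it shows up whenever an inner loop's bound depends on the outer loop variable. The sketch below counts the iterations of such a "triangular" loop and compares them with the closed form; the function names countTriangularIterations and gaussSum are illustrative.

```cpp
// Counts how many times the innermost statement of a triangular nested loop runs.
long long countTriangularIterations(int N) {
    long long count = 0;
    for (int j = 0; j < N; ++j)
        for (int k = 0; k < j; ++k)
            ++count;
    return count;   // 0 + 1 + ... + (N-1)
}

// Gauss's closed form for 1 + 2 + ... + n.
long long gaussSum(long long n) {
    return n * (n + 1) / 2;
}
```

For N = 100 the loop body runs 0 + 1 + ... + 99 = gaussSum(99) = 4950 times, about half of N*N, which is why the constant factor disappears and the loop is still O(N^2).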
Analyzing Sequence of Statements
• For a sequence of statements, compute their
complexity functions individually and add them
up
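for (j=0; j < N; j++)
    for (k=0; k < j; k++)
        sum = sum + j*k;
for (l=0; l < N; l++)
    sum = sum - l;
cout << "Sum=" << sum;
Total cost is O(N^2) + O(N) + O(1) = O(N^2)
SUM RULE: the nested loops cost O(N^2), the single loop costs O(N), and the output statement costs O(1); the largest term dominates.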
Analyzing Conditional Statements
What about conditional statements such as
if (condition)
statement1;
else
statement2;
where statement1 runs in O(N) time and statement2 runs in O(N^2) time?
We use "worst case" complexity: among all inputs of size N, what is the
maximum running time?
The analysis for the example above is O(N^2)
Best Case
• The best case is the input of size n that is
cheapest among all inputs of size n.
• Misunderstanding: “The best case for my algorithm is n=1
because that is the fastest.” WRONG!
Selecting a Data Structure
Select a data structure as follows:
1. Analyze the problem to determine the
resource constraints a solution must
meet.
2. Determine the basic operations that must
be supported. Quantify the resource
constraints for each operation.
3. Select the data structure that best meets
these requirements.
Data Type, Data Structure, and Abstract Data Types
• Data Type
– Set of values that the variable may assume
– E.g., boolean = {false, true}, digit = {0, 1, 2, …., 9}
• Abstract Data Type
– A mathematical model, together with various operations defined on the model
– Algorithms are designed in terms of ADTs and implemented in terms of the data types
and operators supported by the programming language
• Data Structures
– Physical implementation of an ADT
– Data structures used in implementations are provided in a language (primitive or built-in)
or are built from the language constructs (user-defined)
– Each operation associated with the ADT is implemented by one or more
subroutines in the implementation
Array
• An ordered set (sequence) with a fixed
number of elements, all of the same type,
where the basic operation is
direct access to each element in the array
so values can be retrieved from or stored
in this element.
Array representation
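• [5, 2, 4, 8, 1]
• Some of the possible implementations, each mapping index i to a location in a block of 10 memory cells:
– location(i) = i: the elements occupy locations 0 to 4 in order (5 at 0, 2 at 1, 4 at 2, 8 at 3, 1 at 4)
– location(i) = 9 - i: the elements occupy locations 9 down to 5 (5 at 9, 2 at 8, 4 at 7, 8 at 6, 1 at 5)
– location(i) = (7+i)%10: the elements wrap around the end of the block (5 at 7, 2 at 8, 4 at 9, 8 at 0, 1 at 1)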
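The three mappings can be tried out directly. This is a minimal sketch, assuming a block of 10 cells (inferred from the formulas 9 - i and (7 + i) % 10); the names locationForward, locationReverse, locationWrapped and store are illustrative.

```cpp
#include <array>

// The three index-to-location mappings from the slide.
int locationForward(int i) { return i; }
int locationReverse(int i) { return 9 - i; }
int locationWrapped(int i) { return (7 + i) % 10; }

// Stores the values [5, 2, 4, 8, 1] using the given mapping; unused cells stay 0.
std::array<int, 10> store(int (*location)(int)) {
    const int values[5] = {5, 2, 4, 8, 1};
    std::array<int, 10> memory{};          // value-initialised: all zeros
    for (int i = 0; i < 5; ++i)
        memory[location(i)] = values[i];
    return memory;
}
```

Whatever mapping is chosen, element i is found by evaluating one formula, so direct access stays constant time; only the physical layout changes.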
Arrays
Properties:
– Ordered so there is a first element, a second one, etc.
– Fixed number of elements — fixed capacity
– Elements must be the same type (and size);
therefore, use arrays only for homogeneous data sets.
– Direct access: Access an element by giving its location
• The time to access each element is the same for all elements,
regardless of position.
• in contrast to sequential access (where to access an element, one
must first access all those that precede it.)
Declaring Arrays in C++
element_type array_name[CAPACITY];
where
element_type is any type
array_name is the name of the array (any valid identifier)
CAPACITY (a positive integer constant) is the number of
elements in the array
e.g., double score[100];
The elements (or positions) of the array are
indexed 0, 1, 2, . . ., CAPACITY - 1: score[0], score[1], score[2], score[3], . . ., score[99].
The compiler reserves a block of “consecutive”
memory locations, enough to hold CAPACITY
values of type element_type.
Array Initialization
In C++, arrays can be initialized when they are declared.
Numeric arrays:
element_type num_array[CAPACITY] = {list_of_initial_values};
Here {list_of_initial_values} is an array literal.
Example:
double rate[5] = {0.11, 0.13, 0.16, 0.18, 0.21};
rate
index: 0    1    2    3    4
value: 0.11 0.13 0.16 0.18 0.21
Note 1: If fewer values are supplied than the array's capacity, the remaining
elements are assigned 0.
double rate[5] = {0.11, 0.13, 0.16};
rate
index: 0    1    2    3    4
value: 0.11 0.13 0.16 0    0
Note 2: It is an error if more values are supplied than the declared size of
the array.
How this error is handled, however, will vary from one compiler to
another.
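The zero-filling of missing initial values can be observed with a short function. This is only a sketch; the function name partiallyInitialisedRate is illustrative.

```cpp
// Returns element i of an array initialised with fewer values than its capacity.
double partiallyInitialisedRate(int i) {
    double rate[5] = {0.11, 0.13, 0.16};   // rate[3] and rate[4] are set to 0
    return rate[i];                        // caller must keep 0 <= i < 5
}
```

Supplying six values in the braces instead would not compile, which is the error case the note about too many values describes.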
Addresses
When an array is declared, the address of the first byte (or word) in
the block of memory associated with the array is called the base
address of the array.
Each array reference must be translated into an offset from this base
address.
For example, suppose each element of array score is stored in 8 bytes
and the base address of score is 0x1396. A statement such as
cout << score[3] << endl;
requires that the array reference score[3] be translated into a memory address:
0x1396 + 3 * sizeof(double)
= 0x1396 + 3 * 8
= 0x13ae
The contents of the memory word with this
address 0x13ae can then be retrieved and
displayed.
An address translation like this is carried out
each time an array element is accessed.
[Figure: score → 0x1396 at element [0]; score[3] → 0x13ae at element [3]; elements run from [0] to [99]]
What will be the time complexity?
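The translation is one multiplication and one addition regardless of the index, so accessing an array element takes constant time, O(1). The sketch below measures the byte distance between score[0] and score[i]; the function name byteOffset is illustrative, and the value 8 for sizeof(double) is typical but platform dependent.

```cpp
#include <cstddef>

// Returns the distance in bytes between score[0] and score[i] (0 <= i < 100).
std::size_t byteOffset(int i) {
    static double score[100];
    const char* base = reinterpret_cast<const char*>(&score[0]);
    const char* elem = reinterpret_cast<const char*>(&score[i]);
    return static_cast<std::size_t>(elem - base);   // equals i * sizeof(double)
}
```

On a platform where a double occupies 8 bytes, byteOffset(3) is 24, i.e. 0x18, which is exactly the difference between 0x13ae and 0x1396 in the example.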
Problems with Arrays
1. The capacity of an array can NOT change during
program execution.
What is the problem?
Memory wastage (capacity too large)
Out of range errors (capacity too small)
2. Arrays are NOT self-contained objects
What is the problem?
No way to find the last value stored.
Not a self-contained object as per OOP principles.
Dynamic Arrays
• You would like to use an array data structure
but you do not know the size of the array at
compile time.
• You find out when the program executes that
you need an integer array of size n=20.
• Allocate an array using the new operator:
int* y = new int[20]; // or int* y = new int[n]
y[0] = 10;
y[1] = 15; // use is the same
Dynamic Arrays
• ‘y’ is an lvalue; it is a pointer that holds the
address of 20 consecutive cells in memory.
• It can be assigned a value. The new operator
returns an address that is stored in y.
• We can write (x being an ordinary int array declared without new):
y = &x[0];
y = x; // x can appear on the right
// y gets the address of the
// first cell of the x array
Dynamic Arrays
• We must free the memory we got using the
new operator once we are done with the y
array.
delete[] y;
• We would not do this to the x array because we
did not use new to create it.
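Putting allocation, use and release together gives the full life cycle of a dynamic array. The following is a minimal sketch; the function name sumOfDynamicArray and the choice of filling the array with 1..n are illustrative assumptions.

```cpp
// Allocates an int array whose size is known only at run time, fills it with
// 1..n, sums it, and releases the memory before returning.
long long sumOfDynamicArray(int n) {
    int* y = new int[n];          // size chosen at run time
    for (int i = 0; i < n; ++i)
        y[i] = i + 1;             // use is the same as for a static array
    long long total = 0;
    for (int i = 0; i < n; ++i)
        total += y[i];
    delete[] y;                   // every new[] needs a matching delete[]
    return total;
}
```

For n = 20 the result is 210. In present-day C++ a std::vector<int> is usually preferred for this job, since it releases its memory automatically.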
Multidimensional Arrays
Most high level languages support arrays with more than one
dimension.
2D arrays are useful when data has to be arranged in tabular
form.
Higher dimensional arrays are appropriate when several
characteristics are associated with the data.
Example: A table of test scores for several different students on several different tests.
            Test 1   Test 2   Test 3   Test 4
Student 1   99.0     93.5     89.0     91.0
Student 2   66.0     68.0     84.5     82.0
Student 3   88.5     78.5     70.0     65.0
:           :        :        :        :
Student n   100.0    99.5     100.0    99.0
For storage and processing, use a two-dimensional
array.
Declaring Two-Dimensional Arrays
Standard form of declaration:
element_type array_name[NUM_ROWS][NUM_COLUMNS];
Example:
const int NUM_ROWS = 30,
NUM_COLUMNS = 4;
double scoresTable[NUM_ROWS][NUM_COLUMNS];
Initialization
• List the initial values in braces, row by row;
• May use internal braces for each row to improve
readability.
Example (every dimension except the first must be given explicitly):
double rates[][3] = {{0.50, 0.55, 0.53}, // first row
{0.63, 0.58, 0.55}}; // second row
[Figure: scoresTable drawn as a grid with row indices [0] to [29] and column indices [0] to [3]]
Processing Two-Dimensional Arrays
• Remember: rows and columns are numbered from zero!
• Use doubly-indexed variables:
scoresTable[2][3] is the entry in row 2 and column 3
(the first index is the row index, the second is the column index)
• Use nested loops to vary the two indices, most often in a row-wise
manner.
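Row-wise processing with nested loops can be sketched as follows. The table size of 3 rows is an assumption made to keep the example short (the declaration earlier in the lecture uses 30 rows), and the function names rowAverage and tableTotal are illustrative.

```cpp
const int NUM_ROWS = 3, NUM_COLUMNS = 4;   // 3 rows assumed for this small example

// Average of one student's scores (one row of the table).
double rowAverage(const double table[][NUM_COLUMNS], int row) {
    double sum = 0.0;
    for (int col = 0; col < NUM_COLUMNS; ++col)
        sum += table[row][col];
    return sum / NUM_COLUMNS;
}

// Sum of all entries, visiting the table row by row.
double tableTotal(const double table[][NUM_COLUMNS]) {
    double total = 0.0;
    for (int row = 0; row < NUM_ROWS; ++row)          // outer loop: rows
        for (int col = 0; col < NUM_COLUMNS; ++col)   // inner loop: columns
            total += table[row][col];
    return total;
}
```

With the first three students from the score table, rowAverage for row 0 is (99.0 + 93.5 + 89.0 + 91.0) / 4 = 93.125. Visiting the entries row by row also matches the order in which C++ lays a two-dimensional array out in memory.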
Lecture 1 and 2 of Data Structures & Algorithms

  • 2. 2 What is a Computer Program? • To exactly know, what is data structure? We must know: – What is a computer program? Input Some mysterious processing Output
  • 3. Data Structures  Prepares the students for the more advanced material students will encounter in later courses.  Cover well-known data structures such as dynamic arrays, linked lists, stacks, queues, tree and graphs.  Implement data structures in C++
  • 4. 4 Example • Data structure for storing data of students:- – Arrays – Linked Lists • Issues – Space needed – Operations efficiency (Time required to complete operations) • Retrieval • Insertion • Deletion – Frequency of usage of above operations
  • 5. 5 What data structure to use? Data structures let the input and output be represented in a way that can be handled efficiently and effectively. array Linked list tree queue stack
  • 6. Organizing Data  Any organization for a collection of records that can be searched, processed in any order, or modified.  The choice of data structure and algorithm can make the difference between a program running in a few seconds or many days.
  • 7. 7 7 What’s the difference • Different types of values • Different structures – No structure – just a collection of values – Linear structure of values – the order matters – Set of key-value pairs – Hierarchical structures – Grid/table – …. • Different access disciplines – get, put, remove anywhere – get, put, remove only at the ends, or only at the top, or … – get, put, remove by position, or by value, or by key, or … – ….
  • 8. 8 8 Good Algorithms? • Run in less time • Consume less memory But computational resources (time complexity) is usually more important
  • 9. 9 9 Complexity • In examining algorithm efficiency we must understand the idea of complexity • Complexity is the consumptions of resources. • Most important aspect of complexity are – Space complexity – Time Complexity
  • 10. 10 10 Space Complexity • When memory was expensive we focused on making programs as space efficient as possible and developed schemes to make memory appear larger than it really was (virtual memory and memory paging schemes) • Space complexity is still important in the field of embedded computing (hand held computer based equipment like cell phones, palm devices, etc)
  • 11. 11 11 Time Complexity • Is the algorithm “fast enough” for my needs • How much longer will the algorithm take if I increase the amount of data it must process • Given a set of algorithms that accomplish the same thing, which is the right one to choose
  • 12. Running Time of an Algorithm • Depends upon • Input Size • Nature of Input • Generally time grows with size of input, so running time of an algorithm is usually measured as function of input size. • Running time is measured in terms of number of steps/primitive operations performed • Independent from machine, OS
  • 13. Finding running time of an Algorithm / Analyzing an Algorithm • Running time is measured by number of steps/primitive operations performed • Steps means elementary operation like – ,+, *,<, =, A[i] etc • We will measure number of steps taken in term of size of input
  • 14. Simple Example // Input: int A[N], array of N integers // Output: Sum of all numbers in array A int Sum(int A[], int N) { int s=0; for (int i=0; i< N; i++) s = s + A[i]; return s; } How should we analyse this?
  • 15. Simple Example // Input: int A[N], array of N integers // Output: Sum of all numbers in array A int Sum(int A[], int N){ int s=0; for (int i=0; i< N; i++) s = s + A[i]; return s; } 1 2 3 4 5 6 7 8 1,2,8: Once 3,4,5,6,7: Once per each iteration of for loop, N iteration Total: 5N + 3 The complexity function of the algorithm is : f(N) = 5N +3
  • 16. Simple Example /Growth of 5n+3 Estimated running time for different values of N: N = 10 => 53 steps N = 100 => 503 steps N = 1,000 => 5003 steps N = 1,000,000 => 5,000,003 steps As N grows, the number of steps grow in linear proportion to N for this function “Sum”
  • 17. What Dominates in Previous Example? What about the +3 and 5 in 5N+3? – As N gets large, the +3 becomes insignificant – 5 is inaccurate, as different operations require varying amounts of time and also does not have any significant importance What is fundamental is that the time is linear in N. Asymptotic Complexity: As N gets large, concentrate on the highest order term: • Drop lower order terms such as +3 • Drop the constant coefficient of the highest order term i.e. N
  • 18. Comparing Functions: Asymptotic Notation • Big Oh Notation: Upper bound • Omega Notation: Lower bound • Theta Notation: Tighter bound
  • 19. BIG OMEGA NOTATION • If we wanted to say “running time is at least…” we use Ω • Big Omega notation, Ω, is used to express the lower bounds on a function. • If f(n) and g(n) are two complexity functions then we can say: f(n) is Ω(g(n)) if there exist positive numbers c and n0 such that 0<=f(n)>=cΩ(n) for all n>=n0 5n+3=Ω(n)
  • 20. BIG THETA NOTATION • If we wish to express tight bounds we use the theta notation, Θ • f(n) = Θ(g(n)) means that f(n) = O(c1g(n)) and f(n) = Ω(c2g(n)) 20
  • 21. WHAT DOES THIS ALL MEAN? • If f(n) = Θ(g(n)) we say that f(n) and g(n) grow at the same rate, asymptotically • If f(n) = O(g(n)) and f(n) ≠ Ω(g(n)), then we say that f(n) is asymptotically slower growing than g(n). • If f(n) = Ω(g(n)) and f(n) ≠ O(g(n)), then we say that f(n) is asymptotically faster growing than g(n). 21
  • 22. WHICH NOTATION DO WE USE? • To express the efficiency of our algorithms which of the three notations should we use? • As computer scientist we generally like to express our algorithms as big O since we would like to know the upper bounds of our algorithms. • If we know the worse case then we can aim to improve it and/or avoid it. 22
  • 23. Big Oh Notation If f(N) and g(N) are two complexity functions, we say f(N) = O(g(N)) (read "f(N) is order g(N)", or "f(N) is big-O of g(N)") if there are constants c and N0 such that for N > N0, f(N) ≤ c * g(N) for all sufficiently large N.
  • 24. Big Oh Notation • O(f(n)) = {g(n) : there exists positive constants c and n0 such that 0 <= g(n) <= c f(n) } • O(f(n)) is a set of functions. • n = O(n2 ) means that function n belongs to the set of functions O(n2 )
  • 25. Big-Oh Notation • Even though it is correct to say “7n - 3 is O(n3 )”, a better statement is “7n - 3 is O(n)”, that is, one should make the approximation as tight as possible • Simple Rule: Drop lower order terms and constant factors 7n-3 is O(n) 8n2 log n + 5n2 + n is O(n2 log n)
  • 26. Some Questions 3n2 - 100n + 6 = O(n2 )? 3n2 - 100n + 6 = O(n3 )? 3n2 - 100n + 6 = O(n)? 3n2 - 100n + 6 = (n2 )? 3n2 - 100n + 6 = (n3 )? 3n2 - 100n + 6 = (n)? 3n2 - 100n + 6 = (n2 )? 3n2 - 100n + 6 = (n3 )? 3n2 - 100n + 6 = (n)?
  • 27. Performance Classification f(n) Classification 1 Constant: run time is fixed, and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed log n Logarithmic: when n increases, so does run time, but much slower. Common in programs which solve large problems by transforming them into smaller problems. n Linear: run time varies directly with n. Typically, a small amount of processing is done on each element. n log n When n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solves them independently, then combines solutions n2 Quadratic: when n doubles, runtime increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a double nested loop). n3 Cubic: when n doubles, runtime increases eightfold 2n Exponential: when n doubles, run time squares. This is often the result of a natural, “brute force” solution.
  • 28. Size does matter What happens if we double the input size N? N log2N 5N N log2N N2 2N 8 3 40 24 64 256 16 4 80 64 256 65536 32 5 160 160 1024 ~109 64 6 320 384 4096 ~1019 128 7 640 896 16384 ~1038 256 8 1280 2048 65536 ~1076
  • 30. Size does matter • Suppose a program has run time O(n!) and the run time for n = 10 is 1 second For n = 12, the run time is 2 minutes For n = 14, the run time is 6 hours For n = 16, the run time is 2 months For n = 18, the run time is 50 years For n = 20, the run time is 200 centuries
  • 31. Standard Analysis Techniques • Constant time statements • Analyzing Loops • Analyzing Nested Loops • Analyzing Sequence of Statements • Analyzing Conditional Statements
  • 32. Constant time statements • Simplest case: O(1) time statements • Assignment statements of simple data types: int x = y; • Arithmetic operations: x = 5 * y + 4 - z; • Array referencing: A[j] = 5; • Array assignment: for any index j, A[j] = 5; • Most conditional tests: if (x < 12) ...
  • 33. Analyzing Loops • Any loop has two parts: – How many iterations are performed? – How many steps per iteration? int sum = 0,j; for (j=0; j < N; j++) sum = sum +j; – Loop executes N times (0..N-1) – 4 = O(1) steps per iteration • Total time is N * O(1) = O(N*1) = O(N)
  • 34. ANALYZING LOOPS – LINEAR LOOPS • Example (have a look at this code segment): • Efficiency is proportional to the number of iterations. • Efficiency time function is: f(n) = 1 + (n-1) + c*(n-1) + (n-1) = (c+2)*(n-1) + 1 = (c+2)n - (c+2) + 1 • Asymptotically, efficiency is: O(n)
  • 35. Analyzing Loops • What about this for loop? int sum =0, j; for (j=0; j < 100; j++) sum = sum +j; • Loop executes 100 times • 4 = O(1) steps per iteration • Total time is 100 * O(1) = O(100 * 1) = O(100) = O(1)
  • 36. Analyzing Nested Loops • Treat just like a single loop and evaluate each level of nesting as needed: int j,k; for (j=0; j<N; j++) for (k=N; k>0; k--) sum += k+j; • Start with outer loop: – How many iterations? N – How much time per iteration? Need to evaluate inner loop • Inner loop uses O(N) time • Total time is N * O(N) = O(N*N) = O(N^2)
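One way to check the O(N^2) result is to count how often the innermost statement runs. The sketch below keeps the loop bounds from the slide and replaces the body with a counter; the function name is hypothetical:

```cpp
#include <cassert>

// Counts how many times the innermost statement of the nested loop executes.
// The loop bounds are the ones used on the slide.
long long countNestedIterations(int N) {
    assert(N >= 0);
    long long count = 0;
    for (int j = 0; j < N; j++) {       // outer loop: N iterations
        for (int k = N; k > 0; k--) {   // inner loop: N iterations each time
            ++count;                    // stands in for: sum += k + j;
        }
    }
    return count;
}
```

For N = 10 the counter ends at 100, and doubling N to 20 gives 400, which is the fourfold growth expected of a quadratic algorithm.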
  • 37. HOW DID WE GET THIS ANSWER? • When doing Big-O analysis, we sometimes have to compute a series like: 1 + 2 + 3 + ... + (n-1) + n • i.e. the sum of the first n numbers. What is the complexity of this? • Gauss figured out that the sum of the first n numbers is always n(n+1)/2, which is O(n^2)
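The closed form for 1 + 2 + ... + n, namely n(n+1)/2, can be compared against a direct summation. This is a minimal sketch with hypothetical function names; it assumes n is small enough that the result fits in a long long:

```cpp
#include <cassert>

// Adds 1 + 2 + ... + n term by term: O(n) additions.
long long sumByLoop(long long n) {
    assert(n >= 0);
    long long total = 0;
    for (long long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Closed form n(n+1)/2: O(1) arithmetic operations.
long long sumByFormula(long long n) {
    assert(n >= 0);
    return n * (n + 1) / 2;
}
```

Both functions return 5050 for n = 100. Since n(n+1)/2 = n^2/2 + n/2, dropping the lower-order term and the constant factor leaves O(n^2).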
  • 38. Analyzing Sequence of Statements • For a sequence of statements, compute their complexity functions individually and add them up:
      for (j=0; j < N; j++)
          for (k=0; k < j; k++)
              sum = sum + j*k;      (O(N^2))
      for (l=0; l < N; l++)
          sum = sum - l;            (O(N))
      cout << "Sum=" << sum;        (O(1))
    • SUM RULE: total cost is O(N^2) + O(N) + O(1) = O(N^2)
  • 39. Analyzing Conditional Statements • What about conditional statements such as if (condition) statement1; else statement2; where statement1 runs in O(N) time and statement2 runs in O(N^2) time? • We use "worst case" complexity: among all inputs of size N, what is the maximum running time? • The analysis for the example above is O(N^2)
  • 40. Best Case • The best case is the input of size n that is cheapest among all inputs of size n. • Misunderstanding: “The best case for my algorithm is n=1 because that is the fastest.” WRONG! The best case compares inputs of the same size; it does not pick the smallest size.
  • 41. Selecting a Data Structure Select a data structure as follows: 1. Analyze the problem to determine the resource constraints a solution must meet. 2. Determine the basic operations that must be supported. Quantify the resource constraints for each operation. 3. Select the data structure that best meets these requirements.
  • 42. Data Type, Data Structure, and Abstract Data Types • Data Type – Set of values that the variable may assume – E.g., boolean = {false, true}, digit = {0, 1, 2, …, 9} • Abstract Data Type – A mathematical model, together with various operations defined on the model – Algorithms are designed in terms of ADTs and implemented in terms of the data types and operators supported by the programming language • Data Structures – Physical implementation of an ADT – Data structures used in implementations are provided in a language (primitive or built-in) or are built from the language constructs (user-defined) – Each operation associated with the ADT is implemented by one or more subroutines in the implementation
  • 43. Array • An ordered set (sequence) with a fixed number of elements, all of the same type, where the basic operation is direct access to each element in the array so values can be retrieved from or stored in this element.
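  • 44. Array representation • Example array: [5, 2, 4, 8, 1] • The elements can be mapped to memory cells in several ways; with a 10-cell memory, some possible mappings are: – location(i) = i: cells 0..4 hold 5 2 4 8 1 – location(i) = 9 - i: cells 5..9 hold 1 8 4 2 5 – location(i) = (7 + i) % 10: cells 7..9 hold 5 2 4 and cells 0..1 hold 8 1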
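The three mapping functions on this slide can be tried out with a short sketch. It assumes a memory of 10 cells (suggested by the 9 - i and % 10 in the formulas, but not stated on the slide) and marks unused cells with -1; all names other than the formulas themselves are hypothetical:

```cpp
#include <array>
#include <cassert>

constexpr int MEMORY_SIZE = 10;  // assumption: a 10-cell memory

// The three mappings from element index i to memory cell.
int locationIdentity(int i) { return i; }
int locationReversed(int i) { return 9 - i; }
int locationWrapped(int i)  { return (7 + i) % 10; }

// Stores the five elements into memory using the given mapping.
// Cells that receive no element keep the marker value -1.
std::array<int, MEMORY_SIZE> layOut(const std::array<int, 5>& values,
                                    int (*location)(int)) {
    std::array<int, MEMORY_SIZE> memory;
    memory.fill(-1);
    for (int i = 0; i < 5; ++i) {
        int cell = location(i);
        assert(cell >= 0 && cell < MEMORY_SIZE);
        memory[cell] = values[i];
    }
    return memory;
}
```

With the values {5, 2, 4, 8, 1}, the wrapped mapping places 5, 2, 4 in cells 7, 8, 9 and then wraps around to put 8 and 1 in cells 0 and 1. Whichever mapping is used, element i is found by evaluating location(i) once, so access stays O(1).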
  • 45. Arrays Properties: – Ordered, so there is a first element, a second one, etc. – Fixed number of elements (fixed capacity) – Elements must be the same type (and size), so use arrays only for homogeneous data sets. – Direct access: access an element by giving its location • The time to access each element is the same for all elements, regardless of position. • This is in contrast to sequential access, where to access an element, one must first access all those that precede it.
  • 46. Declaring Arrays in C++ • element_type array_name[CAPACITY]; where element_type is any type, array_name is the name of the array (any valid identifier), and CAPACITY (a positive integer constant) is the number of elements in the array • e.g., double score[100]; declares the elements score[0], score[1], score[2], score[3], . . ., score[99] • The elements (or positions) of the array are indexed 0, 1, 2, . . ., CAPACITY - 1. • The compiler reserves a block of “consecutive” memory locations, enough to hold CAPACITY values of type element_type.
  • 47. Array Initialization • In C++, arrays can be initialized when they are declared. Numeric arrays: element_type num_array[CAPACITY] = {list_of_initial_values}; where the braced list is an array literal • Example: double rate[5] = {0.11, 0.13, 0.16, 0.18, 0.21}; gives rate[0..4] = 0.11 0.13 0.16 0.18 0.21 • Note 1: If fewer values are supplied than the array's capacity, the remaining elements are assigned 0: double rate[5] = {0.11, 0.13, 0.16}; gives rate[0..4] = 0.11 0.13 0.16 0 0 • Note 2: It is an error if more values are supplied than the declared size of the array. How this error is handled, however, will vary from one compiler to another.
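Note 1 can be observed directly with a small sketch. The function name is hypothetical; the array and its initializer are the ones from the slide:

```cpp
#include <cassert>

// Returns element `index` of an array that was given only three initial
// values. The two elements without an initializer are set to 0.
double partiallyInitializedElement(int index) {
    assert(index >= 0 && index < 5);
    double rate[5] = {0.11, 0.13, 0.16};  // rate[3] and rate[4] become 0
    return rate[index];
}
```

Calling partiallyInitializedElement(3) or partiallyInitializedElement(4) returns 0, while indices 0 to 2 return the supplied values. This zero-filling applies only when an initializer list is present; a local array declared with no initializer at all has indeterminate contents.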
  • 48. Addresses • When an array is declared, the address of the first byte (or word) in the block of memory associated with the array is called the base address of the array. • Each array reference must be translated into an offset from this base address. • For example, suppose each element of array score is stored in 8 bytes and the base address of score is 0x1396. A statement such as cout << score[3] << endl; requires that the array reference score[3] be translated into a memory address: 0x1396 + 3 * sizeof(double) = 0x1396 + 3 * 8 = 0x13ae • The contents of the memory word at address 0x13ae can then be retrieved and displayed. • An address translation like this is carried out each time an array element is accessed. • What will be the time complexity?
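The address translation described here is a single multiplication and addition. The sketch below reproduces the slide's arithmetic and also measures the byte offset inside a real array; the function names are hypothetical, and the base address 0x1396 is just the slide's example value, not an address a real program would receive:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Address of element `index`, computed as base + index * elementSize.
std::uintptr_t elementAddress(std::uintptr_t base, std::size_t index,
                              std::size_t elementSize) {
    return base + index * elementSize;
}

// Number of bytes between the start of a real array and its element `index`.
std::size_t byteOffset(const double* array, std::size_t index) {
    assert(array != nullptr);
    const char* first  = reinterpret_cast<const char*>(array);
    const char* target = reinterpret_cast<const char*>(array + index);
    return static_cast<std::size_t>(target - first);
}
```

elementAddress(0x1396, 3, 8) evaluates to 0x13ae, as on the slide, and byteOffset(score, 3) for a double array equals 3 * sizeof(double). Because the computation does not depend on the index or on the array's capacity, each element access takes constant time.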
  • 49. Problems with Arrays • 1. The capacity of an array can NOT change during program execution. What is the problem? – Memory wastage – Out-of-range errors • 2. Arrays are NOT self-contained objects. What is the problem? – No way to find the last value stored. – Not a self-contained object as per OOP principles.
  • 50. Dynamic Arrays • You would like to use an array data structure but you do not know the size of the array at compile time. • You find out when the program executes that you need an integer array of size n=20. • Allocate an array using the new operator: int* y = new int[20]; // or int* y = new int[n]; • Use is the same as for an ordinary array: y[0] = 10; y[1] = 15;
  • 51. Dynamic Arrays • ‘y’ is an lvalue; it is a pointer that holds the address of the first of 20 consecutive cells in memory. • It can be assigned a value. The new operator returns an address that is stored in y. • We can write: y = &x[0]; or y = x; (x can appear on the right; y gets the address of the first cell of the x array)
  • 52. Dynamic Arrays • We must free the memory we got using the new operator once we are done with the y array: delete[] y; • We would not do this to the x array because we did not use new to create it.
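The allocate, use, and free steps from these slides can be put together in one function. This is a minimal sketch with a hypothetical function name; filling the array with squares is only there to give the memory something to hold:

```cpp
#include <cassert>

// Allocates an int array whose size is known only at run time, fills it
// with the squares 0, 1, 4, ..., (n-1)^2, sums them, and frees the memory.
long long sumOfSquaresDynamic(int n) {
    assert(n > 0);
    int* y = new int[n];          // size chosen at run time
    for (int i = 0; i < n; ++i) {
        y[i] = i * i;             // used exactly like an ordinary array
    }
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        total += y[i];
    }
    delete[] y;                   // release memory obtained with new[]
    return total;
}
```

Every new[] is paired with exactly one delete[] on the same pointer. In newer C++ code, std::vector<int> is often preferred because it releases its memory automatically, but the raw form above is the one these slides describe.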
  • 53. Multidimensional Arrays • Most high-level languages support arrays with more than one dimension. • 2D arrays are useful when data has to be arranged in tabular form. • Higher-dimensional arrays are appropriate when several characteristics are associated with the data. • Example: a table of test scores for several different students on several different tests. For storage and processing, use a two-dimensional array.
                  Test 1   Test 2   Test 3   Test 4
      Student 1   99.0     93.5     89.0     91.0
      Student 2   66.0     68.0     84.5     82.0
      Student 3   88.5     78.5     70.0     65.0
      :           :        :        :        :
      Student n   100.0    99.5     100.0    99.0
  • 54. Declaring Two-Dimensional Arrays • Standard form of declaration: element_type array_name[NUM_ROWS][NUM_COLUMNS]; • Example: const int NUM_ROWS = 30, NUM_COLUMNS = 4; double scoresTable[NUM_ROWS][NUM_COLUMNS]; (rows are indexed [0] to [29], columns [0] to [3]) • Initialization – List the initial values in braces, row by row – May use internal braces for each row to improve readability • Example: double rates[][3] = {{0.50, 0.55, 0.53}, {0.63, 0.58, 0.55}}; (first row, then second row; only the first dimension may be left empty, so the number of columns must be given)
  • 55. Processing Two-Dimensional Arrays • Remember: rows and columns are numbered from zero! • Use doubly-indexed variables: scoresTable[2][3] is the entry in row 2 and column 3 (the first index is the row index, the second is the column index) • Use nested loops to vary the two indices, most often in a rowwise manner.
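Rowwise processing with nested loops might look like the following sketch. The table size (3 rows, 4 columns) and the function names are assumptions made for illustration; the outer loop selects a row and the inner loop walks across its columns:

```cpp
#include <cassert>

const int NUM_ROWS = 3, NUM_COLUMNS = 4;  // assumed sizes for this sketch

// Average of one student's scores (one row of the table).
double rowAverage(const double table[][NUM_COLUMNS], int row) {
    assert(row >= 0 && row < NUM_ROWS);
    double sum = 0.0;
    for (int col = 0; col < NUM_COLUMNS; ++col) {
        sum += table[row][col];
    }
    return sum / NUM_COLUMNS;
}

// Sum of every entry, visiting the table row by row.
double tableTotal(const double table[][NUM_COLUMNS], int numRows) {
    assert(numRows >= 0 && numRows <= NUM_ROWS);
    double total = 0.0;
    for (int row = 0; row < numRows; ++row) {            // row index
        for (int col = 0; col < NUM_COLUMNS; ++col) {    // column index
            total += table[row][col];
        }
    }
    return total;
}
```

The column count must appear in the parameter type so the compiler can compute where each row starts. Visiting every entry takes NUM_ROWS * NUM_COLUMNS steps, so the cost is proportional to the size of the table.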
