AI
A computer system that can think intelligently and make
decisions independently
ML
Subset of AI
Builds artificially intelligent systems by learning from historical data
Types
Supervised
Unsupervised
Semi supervised
Reinforcement
SUPERVISED
Consider both input and output attributes of data set
Data set divided into
Training set
Test set
Training set - used to create (train) the ML model; the accuracy of the model
is then tested using the test set
2 types of supervised
Classification - works with data sets where the output value is categorical
Regression - used for data sets where the output value is
numerical
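A minimal sketch of the training set / test set idea, using only the standard library. The toy data set, the train_test_split helper and the 1-nearest-neighbour predict function are hypothetical illustrations, not something these notes prescribe:

```python
import random

# Hypothetical toy data set: (input attribute, output label) pairs.
data = [(x, "low" if x < 50 else "high") for x in range(100)]

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and divide them into a training set and a test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

def predict(train_rows, x):
    """Toy classifier: copy the label of the closest training input."""
    nearest = min(train_rows, key=lambda row: abs(row[0] - x))
    return nearest[1]

train_set, test_set = train_test_split(data)

# Accuracy is measured only on the test set, which the model never saw.
correct = sum(predict(train_set, x) == label for x, label in test_set)
accuracy = correct / len(test_set)
print(len(train_set), len(test_set), accuracy)
```

Because the output label here is categorical ("low" / "high"), this is a classification example; predicting a number instead would make it regression.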
UNSUPERVISED :
Does not have an output attribute in the data set
Objective - use the input attributes to identify relationships between data points
and place them into groups based on similarity and dissimilarity measures
SEMI SUPERVISED:
Combination of supervised and unsupervised learning
Used when a few data points are labelled while the majority are
not
REINFORCEMENT:
Determine the course of actions (policy) that the agent should take in the
environment to achieve the highest possible reward
3 major components:
Agent - learner or decision-maker
Environment – includes everything that agent interacts with
Actions - what the agent can do
PYTHON
Interpreted - program executed statement by statement with no separate, explicit compilation step
Highly portable and runs on multiple platforms
Code - compiled to intermediate bytecode, which is then interpreted
Programmer - usually does not see the bytecode file, which is why Python does not look
like a compiled language
Dynamically typed language - the type of a variable in Python is decided at
runtime
Variable types are not declared
SCRIPTING LANGUAGE:
Does not have a separate compilation step
Interpreted directly
JAVA VS PYTHON:
Java – statically typed high level language
Compiled to create bytecode , runs on JVM
Python - dynamically typed high level language(scripting)
Has hidden compilation step and it is interpreted
Multi paradigm language which also provides object oriented features
Anaconda:
Open source distribution of the Python and R languages
Extensively used in data science, ml and deep learning related applications
JUPYTER NOTEBOOK:
Interactive development tool well suited for data science projects
Displays code and output in a single document made up of cells
Uses .ipynb files, which are text files with their content stored in JSON
format
KERNEL and CELLS:
Kernel – computational engine – that executes code contained in a notebook
Cell - container for code to be executed by the notebook's kernel
DATA TYPES:
Dynamically typed lang
Variables not declared
Data types assigned at runtime
Strongly typed lang
Variable types – when assigned – cannot change implicitly
Data types – mutable or immutable
Immutable data types – assigned values cannot change - eg: numbers,
strings and tuples
Mutable – eg: list, set and dictionaries
List - ordered, tuples - ordered, sets - unordered, dictionaries - keep insertion order
since Python 3.7 (unordered in older versions); each key is unique and immutable (hashable) within a particular dictionary
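A short sketch of the three ideas above (dynamic typing, strong typing, mutable vs immutable). The variable names are only for illustration:

```python
# Dynamic typing: the same name can refer to objects of different types.
x = 10
type_before = type(x)      # int
x = "ten"
type_after = type(x)       # str

# Strong typing: no implicit conversion between str and int.
try:
    result = "5" + 5
except TypeError:
    result = "TypeError"

# Mutable vs immutable: a list can be changed in place, a tuple cannot.
nums = [1, 2, 3]
nums[0] = 99
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(type_before, type_after, result, nums, tuple_changed)
```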
NUMBERS – immutable
Python supports different numerical types:
Int ( signed integer)
Float - accurate to about 15 significant decimal digits
Complex - 3+2j (3 and 2 are stored as floats and j represents the square root of -1)
type() function - displays the data type of an object
isinstance() function - checks if the data type of an object matches the specified data
type
No character data type in python
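A small illustration of the numeric types together with type() and isinstance(). The values are arbitrary examples:

```python
a = 7          # int
b = 3.14       # float
c = 3 + 2j     # complex: real part 3, imaginary part 2

print(type(a), type(b), type(c))
print(isinstance(a, int))     # True
print(isinstance(b, int))     # False

# The real and imaginary parts of a complex number are floats.
real_part, imag_part = c.real, c.imag

# There is no separate character type: a single character is a str of length 1.
ch = "M"
print(type(ch), len(ch))
```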
String :
Immutable
Operators:
[] - indexing
[:] - slicing
Membership operators - in and not in
not in - returns True if the character does not exist in the given string
in - does the opposite
Eg: 'M' not in a
REPLACE FUNCTION:
newstring = oldstring.replace('python', 'oracle')
--- it returns a new string in which the substring is replaced with another string
split() - splits the string into substrings based on the specified separator
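The string operations above can be tried out as follows. The sample string is a made-up example:

```python
a = "python is fun"

first = a[0]                  # indexing -> 'p'
part = a[0:6]                 # slicing  -> 'python'
missing = "M" not in a        # True, there is no capital M in the string
present = "fun" in a          # True

# Strings are immutable, so replace() returns a new string.
new = a.replace("python", "oracle")
words = a.split(" ")          # split on the space separator

print(first, part, missing, present)
print(new)
print(words)
```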
LIST:
Mutable ordered collection of elements
[]
Heterogeneous, elements accessed by indexing
A range of elements is accessed by using the colon operator within [ ] - slicing
Additional elements are added to the end by using the append() function
TUPLE: ( values cannot be changed)
Immutable sequential collection
Heterogeneous
Specify by indexing
Concatenated by using + operator
SET:
Mutable unordered collection of unique elements
Values - the same elements may be displayed in a different order; the display order is not guaranteed
Common set operations performed with python sets – union, intersection ,
difference and symmetric differences
Sets are unordered – so they do not support indexing
Individual elements of a set cannot be accessed with an index, and the order of
set elements is not guaranteed to be consistent for each display operation
DICTIONARY:
Mutable collection of key value pairs (keeps insertion order since Python 3.7; unordered in older versions)
mylist.insert(5,15) -- list method: we try to insert the element 15 at index 5. If
index 5 is beyond the end of the list, the element is inserted
at the end of the list
Otherwise it is placed at index 5 and the old value at index 5 (and everything after it) is shifted to the right
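Both cases of insert() described above, shown on two small made-up lists:

```python
# Case 1: index 5 is beyond the end of the list, so 15 goes to the end.
mylist = [10, 20, 30]
mylist.insert(5, 15)
print(mylist)    # [10, 20, 30, 15]

# Case 2: index 5 exists, so 15 takes that position and the rest shift right.
longer = [0, 1, 2, 3, 4, 5, 6]
longer.insert(5, 15)
print(longer)    # [0, 1, 2, 3, 4, 15, 5, 6]
```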
CONTROL STRUCTURES:
Block of code in Python is indicated by its level of indentation
Python interprets any non zero value as true
None, 0 and empty collections (such as '' or []) are interpreted as false
range() - generates values on demand and does not store them all in memory
While loop :
Entry controlled loop - the condition is checked before each iteration, so the body
runs only while the condition evaluates to true (possibly zero times)
Pass:
It is a null statement and used as a placeholder for future code
Use this when we want loop to run but no operation to be performed in the
body
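A short sketch of the control structure points above: truth values, range(), the while loop and pass. The names are illustrative only:

```python
truthy = [bool(5), bool(-1), bool("text")]     # all True
falsy = [bool(0), bool(None), bool("")]        # all False

# range() produces values on demand; no list of a million values is built.
numbers = range(1_000_000)
third = numbers[2]                             # 2

# while is entry controlled: the condition is tested before every iteration.
count = 0
while count < 3:
    count += 1

# pass is a placeholder: the loop runs but the body does nothing.
for _ in range(3):
    pass

print(truthy, falsy, third, count)
```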
FUNCTION:
PASS BY REFERENCE:
Function arguments in Python are passed by object reference
In place changes to a mutable object inside the function are reflected in the same object,
which was passed as the function argument, outside the function
Reassigning the parameter name inside the function does not affect the caller's variable
** a reference to the object is passed to the function rather than a copy of the
object itself
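A minimal sketch of what this means in practice. The function names add_item and rebind are made up for illustration:

```python
def add_item(items):
    # Changes the object that was passed in, so the caller sees it.
    items.append(4)

def rebind(items):
    # Only points the local name at a new list; the caller's list is untouched.
    items = [0]
    return items

nums = [1, 2, 3]
add_item(nums)
after_add = list(nums)       # [1, 2, 3, 4]

rebind(nums)
after_rebind = list(nums)    # still [1, 2, 3, 4]

print(after_add, after_rebind)
```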
Different types of arguments:
1. Required args - the call must supply the same args as specified in the function definition,
in the same order
If they are not in the same order it might not throw an error but will not perform as expected
2. Default - specified by using the = operator in the function definition
During function invocation, if an arg is not passed then its default value is used
3. Keyword - passed by name, so the order doesn't matter
4. Variable length arguments - used if you don't want to pass a fixed number of
arguments in every function call (by specifying *args in the
function definition)
**kwargs
Combines keyword (named) arguments and
variable length arguments - treated as key value pairs
Here the function call passes both the key and the value
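All of the argument types above in one hypothetical function (describe and its parameter names are invented for this sketch):

```python
def describe(name, age, city="unknown", *args, **kwargs):
    """name and age are required, city has a default,
    extra positional values land in args, extra named values in kwargs."""
    return name, age, city, args, kwargs

required_only = describe("Asha", 30)
by_keyword = describe(age=30, name="Asha")          # order does not matter
everything = describe("Asha", 30, "Pune", 1, 2, lang="python")

print(required_only)
print(by_keyword)
print(everything)
```

In the last call, 1 and 2 are collected into the tuple args and lang="python" into the dictionary kwargs.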
SCOPE OF VARIABLE:
Global - variables defined outside any function or class; they can be accessed
anywhere within the program
Local - variables defined within a function
Note - local variables are destroyed after the function terminates, global variables
exist throughout the life of the program
When a local and a global variable have the same name, the local variable
hides the global variable inside the function; when the function terminates the local
variable no longer exists and the name refers to the global variable again
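A small sketch of a local variable hiding a global one with the same name. The names counter and show are illustrative:

```python
counter = 10            # global variable

def show():
    counter = 99        # local variable with the same name hides the global
    return counter

inside = show()         # 99, the local value
outside = counter       # 10, the global value is unchanged

print(inside, outside)
```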
Lambda function:
anonymous function
No name
Has only a single expression but can have multiple arguments
Lambda functions are often used in conjunction with the filter function
filter - keeps every value of x for which the function returns true (eg: collected into list2)
map func - returns the result of applying the function to every item of x
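A short example of lambda with filter and map. The list x and the result names are just sample values:

```python
x = [1, 2, 3, 4, 5, 6]

# filter keeps the items for which the lambda returns True.
list2 = list(filter(lambda n: n % 2 == 0, x))

# map applies the lambda to every item.
squares = list(map(lambda n: n * n, x))

# A lambda is a single expression but may take several arguments.
add = lambda a, b: a + b
total = add(3, 4)

print(list2, squares, total)
```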
Shallow copy - [ : ] -- returns a copy of your list
mylist[slice(0,5,2)] -- start,stop,step (takes indexes 0, 2 and 4, i.e.
every 2nd element)
Step - the increment between the indexes of consecutive selected elements
SHALLOW COPY:
Copying a list using the assignment operator leads to both names referring to the same
data
So changes made through one will be reflected in the other also
To avoid this we perform a shallow copy by using the copy() function
Eg:
mylist=[[3,3],[4,6]]
newlist=mylist.copy()
However, if elements within a nested list are changed, that is reflected in the original
list also
Because the nested lists are copied by reference when performing a shallow
copy
To avoid this, we use a deep copy (copy.deepcopy())
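The difference between assignment, shallow copy and deep copy can be checked with a sketch like this, which reuses the nested list from the example above:

```python
import copy

mylist = [[3, 3], [4, 6]]

alias = mylist                   # assignment: same list object
shallow = mylist.copy()          # new outer list, shared nested lists
deep = copy.deepcopy(mylist)     # new outer list and new nested lists

shallow.append([9, 9])           # top-level change: mylist is not affected
shallow[0][0] = 100              # nested change: also visible in mylist
deep[1][0] = -1                  # nested change in deep copy: mylist not affected

print(mylist)     # [[100, 3], [4, 6]]
print(shallow)    # [[100, 3], [4, 6], [9, 9]]
print(deep)       # [[3, 3], [-1, 6]]
```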
METHODS OF LIST:
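1. cmp() - compares elements in 2 lists (Python 2 only, removed in Python 3)
It returns 1 if a>b
Returns -1 if a<b
Returns 0 if a=b
2. len() - length of list (built-in function)
3. max() - largest element (built-in function)
4. min() - smallest element (built-in function)
5. list(sequence/collection) - converts a sequence object or collection
into a list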
6. Append() - add to end of list
7. Insert() - insert at specified index
8. Remove() – remove the first instance of specified element
9. pop() - removes and returns the element at the specified index (the last element if no index is given)
10. Clear() – remove all
11. Index() – return index of first occurrence
12. Count() – no of occurrence
13. Sort() – sort in ascending order
14. Reverse() – reverse the order
15. copy() - creates a shallow copy; changes applied to the
copy are not reflected in the original except when the
changes are applied to the elements of nested objects
16. deepcopy() - function of the copy module (copy.deepcopy()); creates a deep copy where changes to elements in
nested objects are also not reflected in the original list
SLICING:
Slice by : operator
Slice by slice() function
Eg: slice(0,5,2) -- retrieves the values from index 0 to 4 with a step of 2, so
every second element is taken, eg: for 'oracle' the output is (o,a,l); step - the increment
between the indexes of the selected elements
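Both ways of slicing, sketched on the word from the example and on a made-up list:

```python
word = "oracle"
by_colon = word[0:5:2]                   # indexes 0, 2, 4 -> 'oal'
by_slice = word[slice(0, 5, 2)]          # same result with a slice object

mylist = [10, 20, 30, 40, 50, 60]
every_second = mylist[slice(0, 5, 2)]    # [10, 30, 50]
listcopy = mylist[:]                     # shallow copy of the whole list

print(by_colon, by_slice, every_second, listcopy)
```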
Using list as stack :
LIFO (last in, first out)
append() - pushes an item to the top of the stack
pop() - removes the item from the top
List as queue:
FIFO (first in, first out)
Enqueue (add) , dequeue (remove)
deque - double ended queue (collections.deque)
With a deque, enqueue and dequeue can be performed on both ends
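A minimal sketch of a list used as a stack and a collections.deque used as a queue. The fruit names are sample data:

```python
from collections import deque

# Stack (LIFO): push with append(), pop from the same end.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()               # 'b', the last item pushed

# Queue (FIFO) with a double ended queue.
myqueue = deque(["apple", "banana"])
myqueue.append("guava")         # enqueue at the end
myqueue.appendleft("mango")     # enqueue at the beginning
first = myqueue.popleft()       # dequeue from the beginning -> 'mango'
last = myqueue.pop()            # dequeue from the end -> 'guava'

print(top, first, last, list(myqueue))
```

A deque is used for the queue because removing from the front of a plain list has to shift every remaining element.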
List comprehension:
testlist=["abc",3]
result=["string" if isinstance(i,str) else "integer" for i in testlist]
result
Output:
['string', 'integer']
Creating a shallow copy using the : operator without specifying start and end index
listcopy=mylist[:]
listcopy --for printing the copied value
Shallow copy by using copy() function:
mylist=[ ]
newlist=mylist.copy() --top-level changes to the new
list are not reflected in the old list; changes inside nested objects are
A tuple is immutable; if you want to make changes, convert it
into a list and then change it:
newlist=list(newtuple)
remove() - we give the value directly
pop() - we give the index position to remove a particular element
myqueue.appendleft("mango") - adds mango as the first element
of the deque
Enqueue to the end - append("guava")
Dequeue an element from the end - using the pop() function
Dequeue an element from the beginning - use popleft()
del x[10]
del x[2:4]
del x[ : ] -- deletes all elements of the list (the empty list itself remains)
tuples:
immutable
only mutable objects (such as lists) nested inside a tuple can be changed
individual elements can't be deleted because a tuple is immutable
only the entire tuple can be deleted using del
membership operator: in and not in
eg: print(44 in data)
print(56 not in data)
membership operator can also be used with strings and lists
enumerate function: --displays both index and element
adds a count to the tuple
eg:
a=(10,20,30)
for i,j in enumerate(a):
    print(i,j)
out:
0 10
1 20
2 30
Extending:
The * operator replicates (repeats) a tuple
t=(3,4)
print(t*2)
Out: (3, 4, 3, 4) --the tuple is repeated 2 times
Functions used here:
Len()
Count()
Min()
Max()
Tuple(sequence) : converts the sequence into tuple
A list inside a tuple cannot be replaced because the list is an element of the tuple,
but the content of the list can be changed
Tuples are concatenated by the + operator
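A short sketch of the tuple behaviour described above. The tuple contents are made-up sample values:

```python
t = (1, [2, 3], "x")

t[1].append(4)              # the content of the nested list can change

try:
    t[1] = [0]              # replacing an element of the tuple is not allowed
    replaced = True
except TypeError:
    replaced = False

joined = (3, 4) + (5,)      # concatenation with +
repeated = (3, 4) * 2       # replication with *

# To "change" a tuple, convert it to a list, edit, and convert back.
as_list = list(t)
as_list[0] = 100
new_tuple = tuple(as_list)

print(t, replaced, joined, repeated, new_tuple)
```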
zip function: --performs parallel iteration
Eg:
tup1=
tup2=
res=all(a<b for a,b in zip(tup1,tup2)) --returns True if every comparison passes
str(res)
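A runnable version of the zip idea. The tuple values here are hypothetical, since the notes do not give any:

```python
tup1 = (1, 2, 3)    # hypothetical sample values
tup2 = (4, 5, 6)    # hypothetical sample values

# zip pairs up the elements at the same position.
pairs = list(zip(tup1, tup2))

# all() is True only if every pairwise comparison passes.
res = all(a < b for a, b in zip(tup1, tup2))
res_fail = all(a < b for a, b in zip((1, 9), (4, 5)))   # 9 < 5 fails

print(pairs, str(res), str(res_fail))
```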
SET:
Does not support indexing
Unordered; elements are stored in arbitrary order
Duplicate values are discarded
Elements can be added or removed
Changing elements is not possible using indexing or slicing
Add() – add a single element to set
Update() – add multiple elements to set
Update() – allows addition of tuples,lists,strings or other set to the set
3 methods to remove elements from a set: discard(), remove() and pop()
discard() - does not throw an error even if the item is not available in the set,
so it can be called again for a previously removed element
remove() - raises an error (KeyError) if the item is not available
common methods in set:
union() – return all from both sets
intersection() – common from both
difference() – return from first set but not in second
symmetric_difference() - returns the elements that are in either the first or the second set but not
in both; it is the opposite of intersection
intersection_update() - updates the set so that it keeps only the elements it
has in common with another set
difference_update() - removes all elements of the second set from the
current set
copy() - copies all elements into a new set
isdisjoint() - returns true if two sets have no common elements
issubset() - x.issubset(y) returns true if all elements of set x are present in another
set y
issuperset() - y.issuperset(x) returns true if all the elements of x are in y, and
returns false if any element of x is missing from y
pop() - removes and returns an arbitrary element from the set
clear() – remove all elements
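The set methods above, tried on two small example sets (the values are arbitrary):

```python
x = {1, 2, 3, 4}
y = {3, 4, 5}

union = x.union(y)                       # {1, 2, 3, 4, 5}
common = x.intersection(y)               # {3, 4}
only_x = x.difference(y)                 # {1, 2}
either = x.symmetric_difference(y)       # {1, 2, 5}

s = {1, 2}
s.add(3)                    # add a single element
s.update([4, 5], (6,))      # add elements from several iterables
s.discard(99)               # no error although 99 is not in the set
try:
    s.remove(99)            # remove() raises KeyError for a missing item
    remove_failed = False
except KeyError:
    remove_failed = True

subset = {3, 4}.issubset(x)          # True
superset = x.issuperset({3, 4})      # True
disjoint = x.isdisjoint({7, 8})      # True

print(union, common, only_x, either)
print(s, remove_failed, subset, superset, disjoint)
```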
DICTIONARY:
More than one entry per key is not allowed
When a duplicate key is assigned, only the last assignment is
kept
Do not support slicing
Indexed by their keys
Functions in dict:
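copy() - returns a shallow copy of the dictionary
fromkeys() - creates a dictionary from a sequence of keys
get() - returns the value for a key (None or a given default if the key is missing)
items() - returns a view of the key value pairs as tuple pairs
keys() - returns a view of the dictionary keys
values() - returns a view of the values
update() - adds or overwrites key value pairs from another dictionary
pop() - removes a key and returns its value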
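A quick sketch of the dictionary behaviour and functions listed above. Keys and values are sample data:

```python
# Duplicate key: the last assignment wins.
d = {"a": 1, "b": 2, "a": 3}

value = d["a"]                        # indexed by key -> 3
missing = d.get("z")                  # None, no error for a missing key
fallback = d.get("z", 0)              # 0, the supplied default

pairs = list(d.items())               # [('a', 3), ('b', 2)]
keys = list(d.keys())                 # ['a', 'b']
values = list(d.values())             # [3, 2]

blank = dict.fromkeys(["x", "y"], 0)  # {'x': 0, 'y': 0}

d.update({"c": 4})                    # add or overwrite entries
removed = d.pop("b")                  # removes 'b' and returns 2

print(value, missing, fallback, pairs, keys, values)
print(blank, d, removed)
```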
REGULAR EXPRESSION:
Sequences of ordinary and special characters that define a pattern for
pattern matching
Expressions - contain text and special characters, used to match patterns
in data
re module - used to work with regular expressions
Provides support for Perl-like regular expressions
Metacharacter:
^ - matches the beginning of the string
. - matches a single character, except a new line
[] - matches a single character from the ones inside it [xyz] matches x or
y or z
() - capture and group matched patterns
\t, \n, \r, \f - tab, newline, carriage return, and form feed respectively
* - matches preceding character zero or more times
{m,n} - matches preceding character minimum of “m” times and maximum
of “n” times
{m} - matches exactly m times
? - matches preceding character zero or one time
+ - matches preceding character one or more times, eg: xy+z matches xyz and xyyz but not xz alone
| - either or match
$ - match the end of the string
Functions :
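findall() - returns all occurrences
search() - scans the entire string looking for the first occurrence of the pattern
split() - splits the string at every match of the pattern
sub() - replaces matches with another string
subn() - same as sub() but also returns the number of replacements
escape() - escapes the special characters in a string
match() - matches the pattern only at the beginning of the string
finditer() - returns an iterator of match objects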
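A few of the metacharacters and re functions above, applied to a made-up sample string:

```python
import re

text = "xyz xyyz xz 2023-01-15"

plus = re.findall(r"xy+z", text)          # one or more y -> ['xyz', 'xyyz']
optional = re.findall(r"xy?z", text)      # zero or one y -> ['xyz', 'xz']

year = re.search(r"\d{4}", text).group()  # first run of exactly 4 digits
parts = re.search(r"(\d{4})-(\d{2})-(\d{2})", text).groups()  # () captures groups

starts = re.match(r"xyz", text) is not None    # match() checks only the start
masked = re.sub(r"\d", "#", text)              # replace every digit
words = re.split(r"\s+", text)                 # split on whitespace

print(plus, optional, year, parts)
print(starts, masked, words)
```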
LIBRARIES:
NUMPY
- array processing module
Numerical python
Multi-dimensional arrays
Provides a larger number of numerical data types than core Python
N dimensional array - ndarray
array() - creates an ndarray object
Every element of an ndarray has the same type and occupies the same size of block in memory
PANDAS:
High performance open source python library used primarily for data analysis
and data manipulation
Built around the DataFrame object
- data from various file formats is transformed into DataFrames for further
processing in Python
Handles data preprocessing - both data cleaning and data transformation
2 data structures:
Series - 1d
Data frame – 2d
INDEXING IN DATA FRAMES:
2 types
Label based indexing with loc[] -- DataFrame.loc[]
Integer based indexing - based on the position of the element in the dataframe
--DataFrame.iloc[ ]