Python Basics III
29 Sep 2018
Chapter 3 - Python Basics III
So far, we have learned the data types used in Python, how to assign variables, and how to use conditional and loop statements.
In this chapter, we will discuss functions, modules and classes, and how Python works at a “macro level”.
Functions
A function is a reusable block of code that you can call repeatedly to make calculations, output data, or really do anything that you want.
Simply speaking, you put something in, then you get something out.
This is one of the key aspects of using a programming language. To add to the built-in functions in Python, you can define your own!
The input (or inputs) goes inside the parentheses, while the output is usually stated at the end of the function definition.
Functions are defined with def, a function name, a list of parameters, and a colon. Everything indented below the colon will be included in the definition of the function.
However, a function does not always require an input or an output.
# Example 1
# No input
def hello_world():
    """Prints Hello World!"""
    print("Hello World!")
hello_world()
for i in range(5):
    hello_world()
Hello World!
Hello World!
Hello World!
Hello World!
Hello World!
This type of function is used in the following way:
variable = function()
Function with no output
# Example 2
# No output
# Note: this name shadows Python's built-in sum() function
def sum(a,b):
    print("the sum of {0} and {1} is {2}".format(a,b,a+b))
a = sum(3,4)
print(a) # Prints None: a value is only handed back with "return" inside the function
This type of function is used in the following way:
function(input1, input2, ...)
This type of function is used in the following way:
function()
Function in General
Any variable defined purely within a function will not exist outside of it.
def see_the_scope():
    in_function_string = "I'm stuck in here!"
see_the_scope()
print (in_function_string) # This will not work
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-3-fa033ebcbb3e> in <module>()
4 see_the_scope()
5
----> 6 print (in_function_string) # This will not work
NameError: name 'in_function_string' is not defined
The scope of a variable is the part of a block of code where that variable is tied to a particular value. Functions in Python have an enclosed scope, making it so that variables defined within them can only be accessed directly within them.
If we pass those values to a return statement we can get them out of the function. This makes it so that the function call returns values so that you can store them in variables that have a greater scope.
In this case specifically, including a return statement allows us to keep the string value that we define in the function.
def free_the_scope():
    in_function_string = "Anything you can do I can do better!"
    return in_function_string
my_string = free_the_scope()
print (my_string)
Anything you can do I can do better!
# Example 2
def volume(length, width, depth):
    """Calculates the volume of a rectangular prism"""
    return length*width*depth
volume(4,7,5)
If we want to, we can define a function so that it takes an arbitrary number of parameters. We tell Python that we want this by using an asterisk (*).
def sum_values(*args):
    sum_val = 0
    for i in args:
        sum_val += i
    return sum_val
print(sum_values(1,2,3,4,5))
print(sum_values(1,2,5))
The time to use *args as a parameter for your function is when you do not know how many values may be passed to it, as in the case of our sum function. The asterisk in this case is the syntax that tells Python that you are going to pass an arbitrary number of parameters into your function. These parameters are stored in the form of a tuple.
def test_args(*args):
    print (type(args))
test_args(1, 2, 3, 4, 5, 6)
A function has only one return value
def sum_and_mul(a,b):
    return a+b, a*b
def sum_and_mul_fake(a,b):
    return a+b
    return a*b # never reached: the function exits at the first return
result = sum_and_mul(2,5)
print(result) # The result is a single tuple value
result2 = sum_and_mul_fake(2,5)
print(result2) # Only the first return value comes back
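Because the comma-separated values come back as one tuple, the result can also be unpacked into separate variables. A minimal sketch reusing sum_and_mul from above (the names total and product are chosen here for illustration):

```python
def sum_and_mul(a, b):
    # The two values are packed into a single tuple
    return a + b, a * b

# Unpack the returned tuple into two variables
total, product = sum_and_mul(2, 5)
print(total)    # 7
print(product)  # 10
```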
Most of the programs and software that we use today require user input. Logins and passwords are representative examples. In this part, we will learn how to accept user input with Python.
a = input() # enter what you want to say!
input("Please enter your favorite color")
Please enter your favorite colorit is black
'it is black'
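Note that input() always returns a string, so numeric input has to be converted before it can be used in arithmetic. A minimal sketch, assuming a hypothetical helper to_number and a sample text "25" that stands in for what a user might type:

```python
def to_number(text):
    """Convert text such as the result of input() into an int."""
    return int(text.strip())

typed = "25"            # stands in for: typed = input("Enter your age: ")
age = to_number(typed)
print(type(typed))      # <class 'str'>
print(age + 1)          # 26
```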
Printing results in a single line
for i in range(0,10):
    print(i)
for i in range(0,10):
    print(i, end=" ")
for i in range(0,10):
    print(i, end="and")
0and1and2and3and4and5and6and7and8and9and
Printing Strings
print("I" "love" "you" "so" "much")
print("I"+"love"+"you"+"so"+"much")
# Comma provides space
print("I " "love " "you " "so " "much ")
print("I","love","you","so","much")
# Enter (line change)
print("H""\n""i")
# Tab
print("this","is","\t","tab")
Iloveyousomuch
Iloveyousomuch
I love you so much
I love you so much
H
i
this is tab
Class
A common way to picture a class is a cookie stamp.
A class is the stamp that shapes the cookies (objects). It lets programmers create objects that share the characteristics defined by the class.
Functions: they do specific things
Classes: they are specific things
Let’s look at some examples and see what exactly a class is.
# Example 0.5
class Calculator:
    pass
a = Calculator()
type(a)
# Example 1
class Calculator:
    def data(self, first, second):
        self.first = first
        self.second = second
    def sum(self):
        result = self.first + self.second
        return result
    def sub(self):
        result = self.first - self.second
        return result
    def mul(self):
        result = self.first * self.second
        return result
    def div(self):
        result = self.first / self.second
        return result
a = Calculator()
a.data(2,5)
print(a.sum(), a.sub(), a.mul(), a.div())
# Example 2
# Making a cookie stamp
class Employee:
    """Common basic information of all employees"""
    empNum = 0
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empNum += 1
    def show_empNum(self):
        print ("Total Employee Number is {0}".format(Employee.empNum))
    def show_empInfo(self):
        print ("Name: ", self.name, ", Salary: ", self.salary)
# Stamping cookies
emp1 = Employee("Jake","3000")
emp2 = Employee("Mina","2000")
emp3 = Employee("John","1000")
emp4 = Employee("Mike","500")
# Using functions defined in class
emp1.show_empInfo()
emp2.show_empInfo()
print("\n")
emp3.show_empNum()
emp4.show_empNum()
Name: Jake , Salary: 3000
Name: Mina , Salary: 2000
Total Employee Number is 4
Total Employee Number is 4
print("Total Employee Number = " + str(Employee.empNum))
Total Employee Number = 4
Modules
A module is a file that contains classes, variables or functions. It is a Python file that can be “imported” and used. Python programmers can share modules and use them in their own code by importing them.
from IPython.display import IFrame
# Importing the IFrame class from the IPython.display module
IFrame("http://comic.naver.com", width=800, height=300)
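Modules from the standard library are imported in the same way. The sketch below uses the built-in statistics module to show three common import forms; the alias st and the sample scores are arbitrary choices made for this illustration:

```python
import statistics                 # import the whole module
from statistics import median     # import a single name from the module
import statistics as st           # import the module under a shorter alias

scores = [70, 80, 90]
print(statistics.mean(scores))    # access through the module name
print(median(scores))             # use the imported name directly
print(st.mean(scores))            # access through the alias
```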
Installing Modules
Every file and folder on a computer has a path, or directory. It works like an address.
“C:/Users/username/Desktop” –> this is probably the address of your Desktop
Python also lives in a folder somewhere on your computer. If your PATH has been set properly, you can call pip to install the Python modules you need.
In the command prompt, you can type
pip3 install module_name or python3 -m pip install module_name
Depending on your environment variables, the command used to call pip can be slightly different.
Python Basics II
29 Sep 2018
Chapter 2 - Python Basics II
In the previous chapter, we learned about different data types. In terms of learning English, we have finished the alphabet. This chapter will cover how to use data types. It is time to learn about Pythonic grammar!
Variables
We have already learned about assigning variables with different data types. In this section, a more advanced approach to assigning variables will be shown.
# You can assign variables like this
a,b = ("Python", "Grammar")
print(a)
print(b)
# Remember that a tuple can omit its parentheses!
c,d = "Maplestory", "Crazy Arcade"
print(c)
print(d)
Python
Grammar
Maplestory
Crazy Arcade
e = f = "python"
print(e)
print(f)
[g,h] = ["Coding","is fun"]
print(g)
print(h)
You can switch or reassign values like this
a = 3
b = 5
# Switch!
a,b = b,a
print(a)
print(b)
a = 4
# reassigning variable a
a = a+1
print(a)
Control Statements
Control statements are the basic structure of code. They determine the flow of execution. Code is executed line by line, one statement at a time, from top to bottom. The sequence of statements that are executed is referred to as the flow of control. The flow of control can be changed with control statements: conditional and loop statements.
- conditional
- loop
When using these control statements, indentation and the colon (:) are very important. Conditional and loop statements are frequently used together.
Conditional Statement - if
Conditional statements are statements that may or may not be executed, depending on conditions that the code specifies. The condition is where Boolean values and logical operators are frequently used.
Example)
Let’s assume that there is a movie rated R, which here means that only people aged 18 and over can watch it. We have to screen out people who are too young. Your cousin is 14. Let’s try this out using an if statement.
age = 14
if age < 18:
    print("You are not allowed to watch this movie")
else:
    print("Enjoy the movie!")
You are not allowed to watch this movie
Structure
We can create segments of code that only execute if a set of conditions is met. We use if-statements in conjunction with logical statements in order to create branches in our code.
An if block gets entered when the condition is considered to be True. If condition is evaluated as False, the if block will simply be skipped unless there is an else block to accompany it. Conditions are made using either logical operators or by using the truthiness of values in Python. An if-statement is defined with a colon and a block of indented text.
# This is the basic format of an if statement. This is a vacuous example.
if "Condition1":
    # This block of code will execute because the string is non-empty
    # Everything on these indented lines
    print (True)
else:
    # So if the condition that we examined with if is in fact False
    # This block of code will execute INSTEAD of the first block of code
    # Everything on these indented lines
    print (False)
# If you want more than two branches, you should use "elif"
if "Condition":
    # Everything on these indented lines
    print (True)
elif "Condition2":
    # This block of code will execute only if the previous condition is False
    # Everything on these indented lines
    print("What?")
elif "Condition3":
    # Everything on these indented lines
    print("What??")
else:
    # This block of code will execute INSTEAD of all the blocks above
    # Everything on these indented lines
    print (False)
Ways to write condition statements
Various expressions:
- x < y, x > y
- x == y, x != y
- x >= y, x <= y
- x or y
- x and y
- not x
- x in list, x not in list
- x in tuple, x not in tuple
- x in string, x not in string
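Each expression in the list evaluates to True or False. A short sketch with made-up values (x, y and fruits are illustrative names only):

```python
x, y = 3, 5
fruits = ["apple", "banana"]

print(x < y)                 # True
print(x == y)                # False
print(x != y)                # True
print(x > 1 and y > 10)      # False, both sides must be True
print(x > 1 or y > 10)       # True, one side is enough
print(not x > 1)             # False
print("apple" in fruits)     # True
print("py" not in "python")  # False
```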
# Example 1
money = 3000
feeling = "bad"
if money < 3000:
    print("Walk")
elif money >= 3000 and feeling == "good":
    print("Take Bus")
else:
    print("Take Taxi")
# Example 2
menu = {"ramen":["flour","egg","pork","onion"],
        "spaghetti":["flour","cream","tomato","beef","onion"],
        "kimbap":["rice","carrot","spam","egg","spinach","pickle"]}
if "flour" not in menu["ramen"]:
    print("I will eat ramen!!!")
elif "kimbap" in menu.keys():
    print("Give me donkatsu")
else:
    print("spaghetti for me please")
# Example 3
i = 10
if i % 2 == 0:
    print("{0} is even number!!!".format(i))
    if i % 3 == 0:
        print ('{0} is divisible by both 2 and 3! Wow!'.format(i))
    elif i % 5 == 0:
        print ('{0} is divisible by both 2 and 5! Wow!'.format(i))
    else:
        print ('{0} is divisible by 2, but not 3 or 5. Meh.'.format(i))
else:
    print ('{0} is an odd number.'.format(i))
10 is even number!!!
10 is divisible by both 2 and 5! Wow!
Loop Statement - For
for is a loop statement used to go through the values of, or iterate over, any sequence (tuple, list, string).
# Example 1
for i in range(0,5):
    print (i)
# Example 2
for i in range(5):
    print (i)
# Example 3
a = [(1,2),(3,4),(5,6)]
for (x,y) in a:
    print(x+y)
# Example 4
result = []
for i in range(2,5):
    for j in range(3,6):
        result.append(i*j)
print(result)
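The examples above loop over range() and a list. Strings and tuples can be looped over in the same way, as this small sketch with made-up values shows:

```python
letters = []
for ch in "abc":          # a string is a sequence of characters
    letters.append(ch)
print(letters)            # ['a', 'b', 'c']

total = 0
for n in (10, 20, 30):    # a tuple works the same way
    total += n
print(total)              # 60
```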
Stopping the Loop
for i in range(5):
    if i == 2:
        break
    print (i)
Loop Statement - While
With while loops, we need to make sure that something actually changes from iteration to iteration so that the loop eventually terminates.
If not, an infinite loop is created and your code will never stop. For example, the shorthand i -= 1 (short for i = i - 1) makes the value of i smaller with each iteration. If the condition is i > 0, i will eventually be reduced to 0, making the condition False and exiting the loop. Alternatively, you can use break inside a conditional statement to stop the loop.
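A minimal sketch of such a countdown loop (the list visited is only there to record what the loop did):

```python
i = 5
visited = []
while i > 0:
    visited.append(i)
    i -= 1              # shorthand for i = i - 1
print(visited)          # [5, 4, 3, 2, 1]
print(i)                # 0, so the condition i > 0 is now False
```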
# Example 1
treeHit = 0
while treeHit < 10:
    treeHit = treeHit +1
    print("Tree hit %d times." % treeHit)
    if treeHit == 10:
        print("Tree falls.")
Tree hit 1 times.
Tree hit 2 times.
Tree hit 3 times.
Tree hit 4 times.
Tree hit 5 times.
Tree hit 6 times.
Tree hit 7 times.
Tree hit 8 times.
Tree hit 9 times.
Tree hit 10 times.
Tree falls.
# Example 2
i = 10
while True:
    if i == 14:
        break
    i += 1 # This is shorthand for i = i + 1.
    print( i)
# Example 3
Stamina = 50
apple = 3
while Stamina>0:
    Stamina -= 5
    # Use the logical operator "and" here; the bitwise "&" binds tighter than < and >
    while Stamina < 10 and apple > 0:
        print("Eat apple")
        Stamina += 10
        apple -= 1
        print("Remaining apple is {0}".format(apple))
    print("Current Stamina is {0}".format(Stamina))
    if Stamina == 0:
        print("You are Dead")
    else:
        pass
Current Stamina is 45
Current Stamina is 40
Current Stamina is 35
Current Stamina is 30
Current Stamina is 25
Current Stamina is 20
Current Stamina is 15
Current Stamina is 10
Eat apple
Remaining apple is 2
Current Stamina is 15
Current Stamina is 10
Eat apple
Remaining apple is 1
Current Stamina is 15
Current Stamina is 10
Eat apple
Remaining apple is 0
Current Stamina is 15
Current Stamina is 10
Current Stamina is 5
Current Stamina is 0
You are Dead
Break, Continue, Pass
The break, continue and pass statements are used to work with loops and conditions more efficiently and effectively.
A break statement stops and exits the loop completely when an external condition is triggered.
number = 0
for number in range(10):
    number = number + 1
    if number == 5: # external condition
        break # break here
    print('Number is ' + str(number))
print('Out of loop')
Number is 1
Number is 2
Number is 3
Number is 4
Out of loop
A continue statement allows you to skip the rest of the current iteration when an external condition is triggered, and then go on with the remaining iterations of the loop.
number = 0
for number in range(10):
    number = number + 1
    if number == 5: # external condition
        continue # skip the rest of this iteration
    print('Number is ' + str(number)) # Number is 5 will not be printed
print('Out of loop')
Number is 1
Number is 2
Number is 3
Number is 4
Number is 6
Number is 7
Number is 8
Number is 9
Number is 10
Out of loop
The pass statement does nothing: it lets you handle the condition without affecting the loop in any way.
number = 0
for number in range(10):
    number = number + 1
    if number == 5: # external condition
        pass # do nothing and carry on
    print('Number is ' + str(number))
print('Out of loop')
Number is 1
Number is 2
Number is 3
Number is 4
Number is 5
Number is 6
Number is 7
Number is 8
Number is 9
Number is 10
Out of loop
Python Basics I
28 Sep 2018
Chapter 1 - Python Basics I
Python is an intuitive language
Python is very intuitive, compared to other programming languages. This is why people say that it is easy to learn for beginners.
Let’s look at the example below
x = 3
if x ==3: print("correct")
Just by reading this code, most people can catch what this code is saying.
The first line says that “x” equals 3
The second line says that if “x” equals 3, print the word “correct”
Pretty easy eh?
Code comments are explanations written by developers. They are ignored during code execution. A comment starts with a hash sign (#). For comments that span multiple lines, a string wrapped in three quotation marks is used. Below is an example
# This is a comment
# The comment section is ignored during the code execution
Multi-line comments do not do anything either.
"""
This is a multi-line comment. It is a special type of string called, Doc-string.
Normally, they are used for detailed explanation of codes, or long comments
"""
'\nThis is a multi-line comment. It is a special type of string called, Doc-string.\nNormally, they are used for detailed explanation of codes, or long comments\n'
Data Types
When you learn English, the alphabet is the fundamental unit of knowledge. Python is also a programming “language”, and knowing its data types is as important as knowing the alphabet.
These are the main data types of Python:
- Numbers
- Strings
- Boolean
- Lists
- Tuples
- Dictionary
- Set
Let’s cover these data types one by one
Numbers
Numbers include integers, floating-point numbers and complex numbers. If you have studied mathematics, these terms should be familiar. There are also octal and hexadecimal numbers, which are not frequently used in general Python programming.
- integers –> int
- floating-point number –> float
- complex number –> complex
Let’s look at them one by one
Integer
# Integer
## An integer is a whole number.
# It has no digits after the decimal point
x = 50
print(x) # Simple line that orders to show (print) x
type(x) # This shows what the data type of x is
Variables are assigned using a single equals sign “=”. Variable names are case sensitive, but using names that differ only in case is not recommended in complex code because it hurts readability and causes confusion.
one = 3
One = 6
print(one, One)
Float
# Floating Point
# Floating point is a name for a real number.
# It may have numbers after the decimal point
y = 1.4
z = 4.0
print(y)
print(z)
# You can use the int() function to convert a float to an integer (the decimal part is dropped)
print(int(y))
print(int(z))
#You can use float() function to force a number to be considered as a float
my_int = 3
print(type(my_int))
my_int2 = float(3)
print(type(my_int2))
1.4
4.0
1
4
<class 'int'>
<class 'float'>
Complex
# Complex
# Complex is a complex number.
# It is a number that can be expressed in the form of a + bi
# a, b are real numbers while i is the solution of x^2= -1
x = complex(2,4)
y = complex(1,5)
print(x,y, x + y)
Basic Math
Python has a number of built-in math functions. These can be extended even further by importing the math package or by including any number of other calculation-based packages.
All of the basic arithmetic operations are supported: +, -, /, and *. You can create exponents by using ** and modular arithmetic is introduced with the mod operator, %.
print ('Addition: ', 2 + 2)
print ('Subtraction: ', 7 - 4)
print ('Multiplication: ', 2 * 5)
print ('Division: ', 10 / 2)
print ('Exponentiation: ', 3**2)
print ('Modulo: ', 15 % 4) # remainder
Addition: 4
Subtraction: 3
Multiplication: 10
Division: 5.0
Exponentiation: 9
Modulo: 3
This also works on variables
first_integer = 4
second_integer = 5
print (first_integer * second_integer)
x = 3
x - 2
first_integer = 11
second_integer = 3
print (first_integer / second_integer)
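As mentioned above, the math package extends the basic operators. A brief sketch using a few functions from the standard math module, with inputs picked only for illustration:

```python
import math

print(math.sqrt(25))      # 5.0
print(math.floor(3.7))    # 3
print(math.ceil(3.2))     # 4
print(math.factorial(5))  # 120
```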
Python’s built-in math functions
- abs()
- round()
- max()
- min()
- sum()
These functions all act as you would expect, given their names. Calling abs() on a number will return its absolute value. The round() function will round a number to a specified number of decimal places (the default is 0). Calling max() or min() on a collection of numbers will return, respectively, the maximum or minimum value in the collection. Calling sum() on a collection of numbers will add them all up. If you’re not familiar with how collections of values in Python work, don’t worry! We will cover collections in-depth in the next section.
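A short sketch of these built-in functions in action (the list values is a made-up sample):

```python
values = [3, -7, 2.5, 10]

print(abs(-7))            # 7
print(round(3.14159, 2))  # 3.14
print(round(3.7))         # 4
print(max(values))        # 10
print(min(values))        # -7
print(sum(values))        # 8.5
```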
Strings
A string is a data type that lets you store text in a variable. Strings are defined using single quotes or double quotes.
Both are allowed so that we can include apostrophes or quotation marks in a string if we so choose.
You may also use a backslash (\) to include apostrophes or quotation marks in a string.
say_hello = "Hello. My name is python"
print(say_hello)
say_bye = 'Bye. It was nice to meet you'
print(say_bye)
my_string = '"Jabberwocky", by Lewis Carroll'
print (my_string)
my_string = "'Twas brillig, and the slithy toves / Did gyre and gimble in the wabe;"
print (my_string)
alt_string = 'He said, \'this is ridiculous.\''
print(alt_string)
Hello. My name is python
Bye. It was nice to meet you
"Jabberwocky", by Lewis Carroll
'Twas brillig, and the slithy toves / Did gyre and gimble in the wabe;
He said, 'this is ridiculous.'
Calculation in Strings
# Addition
first_name = "Harry"
last_name = "Potter"
first_name + last_name
# Multiplication
twice = "likey"
twice*2
# Usage
print("="*40)
print("Oh Yeah")
print("="*40)
========================================
Oh Yeah
========================================
Indexing and Slicing
sentence = "wingardium leviosa"
# Python starts from 0
print(sentence[4])
print(sentence[0])
print(sentence[-2])
# Usage
print( sentence[11] + sentence[15] + sentence[13] + sentence[12])
# sentence[0:4] --> characters at index 0 <= i < 4
print( sentence[0:4])
# Changing string by indexing
a = "League of Leg nd"
print(a)
# You cannot change a string or a tuple. These are called immutable data types
a[-3] = "e"
print(a)
League of Leg nd
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-56-e0f9681d392f> in <module>()
4
5 # You cannot change a string or a tuple. These are called immutable data types
----> 6 a[-3] = "e"
7 print(a)
TypeError: 'str' object does not support item assignment
correct = a[:7] + a[7:-3] + "e" + a[-2:]
print(correct)
Using the format() method we can add in variable values and generally format our strings.
my_string = "{0} {1}".format('Marco', 'Polo')
print( my_string)
my_string = "{1} {0}".format('Marco', 'Polo')
print (my_string)
We use braces ({}) to indicate parts of the string that will be filled in later and we use the arguments of the format() function to provide the values to substitute. The numbers within the braces indicate the index of the value in the format() arguments
print ('insert %s here' % 'value')
The % symbol basically cues Python to create a placeholder. Whatever character follows the % (in the string) indicates what sort of type the value put into the placeholder will have. This character is called a conversion type. Once the string has been closed, we need another % that will be followed by the values to insert. In the case of one value, you can just put it there. If you are inserting more than one value, they must be enclosed in a tuple.
print ('There are %s cats in my %s' % (13, 'apartment'))
There are 13 cats in my apartment
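Different conversion types can be mixed in one string. A small sketch, assuming made-up values, that uses %s for text, %d for an integer and %.2f for a float with two decimal places:

```python
name = "Jake"
count = 3
price = 4.5

# The values are passed as a tuple, in the same order as the placeholders
line = "%s bought %d apples for %.2f dollars" % (name, count, price)
print(line)   # Jake bought 3 apples for 4.50 dollars
```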
- count()
- find()
- index()
- join()
- upper() / lower()
- lstrip() / rstrip()
- replace()
- split()
# Returns the number of times a substring occurs in the string
a = "Python is fun"
a.count("n")
# Returns the index of the first occurrence. If it does not exist, returns -1
print(a.find("n"))
print(a.find("k"))
# Returns the index of the first occurrence (raises ValueError if it does not exist)
print(a.index("s"))
# Inserts string a between "1" and "2"
print(a.join("12"))
# Change the characters to uppercase or lowercase
print(a.upper(), a.lower())
PYTHON IS FUN python is fun
# Deleting spaces on the left or right side of the string
b = " I have lots of space "
print(b.lstrip())
print(b.rstrip())
# Deleting spaces on both sides of the string
print(b.strip())
I have lots of space
I have lots of space
I have lots of space
# Replace
a.replace("fun","horrible")
# Splitting
print(a.split())
# Another example
k = "I/do/not/like/your/shirt"
print(k.split("/"))
print(k.split("t"))
['Python', 'is', 'fun']
['I', 'do', 'not', 'like', 'your', 'shirt']
['I/do/no', '/like/your/shir', '']
Encoding
In Python 3, a normal string (str) is a sequence of Unicode characters. ASCII is a 7-bit code, usually stored as 1 byte per character, that covers only 128 characters such as English letters, digits and punctuation. Unicode can represent a far more varied set of characters.
Chinese characters and Korean characters cannot be represented in ASCII code.
UTF-8 is a variable-width encoding of Unicode that uses 1 to 4 bytes per character. It works as an extension of ASCII: ASCII characters still take 1 byte, while a Korean character takes 3 bytes.
Frequent errors that you can encounter
a = "안녕"
b = a.encode("utf-8")
print(b)
c = b.decode("utf-8")
print(c)
b'\xec\x95\x88\xeb\x85\x95'
안녕
str – encode () –> bytes – decode() –> str
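The difference between characters and bytes can be checked with len(). The sketch below reuses the same Korean string and also shows what happens when the text is encoded as ASCII:

```python
text = "안녕"
data = text.encode("utf-8")

print(len(text))   # 2 characters
print(len(data))   # 6 bytes, 3 per Korean character in UTF-8

# ASCII cannot represent Korean, so encoding to it raises an error
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    print("Cannot encode:", err.reason)
```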
Boolean
Boolean is related to logical operators, and takes one of only two values: True or False (True = 1, False = 0)
The default Boolean values of common objects are as follows:
Value | True or False
"python" | True
"" | False
[] | False
{} | False
() | False
0 | False
1 | True
[1,2,3] | True
None | False
In general, an empty string, list, tuple or dictionary is False, and a non-empty one is True. In terms of numbers, 0 is False while any non-zero number, such as 1, is True.
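These default values can be checked with the built-in bool() function. A minimal sketch over the same values as the table (the names samples and truthy are chosen here):

```python
samples = ["python", "", [], {}, (), 0, 1, [1, 2, 3], None]
for value in samples:
    print(repr(value), "->", bool(value))

# Keep only the values that count as True
truthy = [v for v in samples if v]
print(truthy)   # ['python', 1, [1, 2, 3]]
```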
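and (&), or (|), not
Note that & and | are bitwise operators. On Boolean values they give the same result as and / or, but the keywords and / or are the usual choice in conditions.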
# And
print(True & True) # True and True
print(True & False) # True and False
print(False & False) # False and False
print("="*30)
# Or
print(True | True) # True or True
print(True | False) # True or False
print(False | False) # False or False
print("="*30)
# Not
print(not True)
print(not False)
True
False
False
==============================
True
True
False
==============================
False
True
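One practical difference is operator precedence: & binds more tightly than comparisons such as < and >, so mixing them without parentheses can give a surprising result. A small sketch with made-up values:

```python
stamina = 5
apple = 3

# Logical operator: reads like English
print(stamina < 10 and apple > 0)       # True

# & is evaluated before < and >, so this is parsed as
# stamina < (10 & apple) > 0, which is a different test
print(stamina < 10 & apple > 0)         # False

# Parentheses make & operate on the two Boolean results
print((stamina < 10) & (apple > 0))     # True
```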
Lists
A list is an ordered collection of objects that can contain any data type. A list is defined using square brackets [ ]
eiffel_tower = ["Paris","France","1889",300.65, 324]
We can access the elements of a list by using brackets as well. In order to select an individual element, simply type the list name followed by the index of the item you are looking for in square brackets.
Indexing in Python starts from 0. If you have a list of length n, the first element of the list is at index 0, the second element is at index 1, and so on. The final element of the list will be at index n−1. Be careful! Trying to access a non-existent index will cause an error.
We can see the number of elements in a list by calling the len() function.
We can update a list by assigning to an index. Unlike strings, lists are mutable, meaning you can change them. Once a string or other immutable data type has been created, however, it cannot be directly modified without creating an entirely new object.
print(eiffel_tower[4])
eiffel_tower[4] = 320
print(eiffel_tower[4])
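A short sketch of len() and of the error raised by a non-existent index, using the original eiffel_tower list again (the variable n is introduced here for illustration):

```python
eiffel_tower = ["Paris", "France", "1889", 300.65, 324]

n = len(eiffel_tower)
print(n)                      # 5
print(eiffel_tower[n - 1])    # 324, the last element

try:
    eiffel_tower[n]           # index 5 does not exist
except IndexError as err:
    print("IndexError:", err)
```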
If you want to put two lists together, they can be combined with a + symbol.
# You can put list in a list!
list1 = ["i", 3, 4.30, [1,2,"harry"], "you"]
list2 = ["hello",4.5, 3.0]
list3 = list1 + list2
print(list3)
['i', 3, 4.3, [1, 2, 'harry'], 'you', 'hello', 4.5, 3.0]
Indexing and Slicing
Lists are indexed and sliced in the same way as strings.
starwars = ["may","the","Force","be","with","you"]
print(starwars[2:4])
print(starwars[1:])
print(starwars[:3])
['Force', 'be']
['the', 'Force', 'be', 'with', 'you']
['may', 'the', 'Force']
['may', 'the', 'Force', 'be', 'with', 'you']
You can also add a third component to slicing. Instead of simply indicating the first and final parts of your slice, you can specify the step size that you want to take. So instead of taking every single element, you can take every other element.
confusing = ["may",1,"the",2,"Force",3,"be",4,"with",5,"you"]
print(confusing[0:10:2])
print(confusing[::2])
print(confusing[::-1])
print(confusing[::-2])
['may', 'the', 'Force', 'be', 'with']
['may', 'the', 'Force', 'be', 'with', 'you']
['you', 5, 'with', 4, 'be', 3, 'Force', 2, 'the', 1, 'may']
['you', 'with', 'be', 'Force', 'the', 'may']
Functions related to lists
- append
- sort
- reverse
- index
- insert
- remove
- pop
- count
- extend
# append
a = [1,2,3]
a.append(4)
a
# sort
a = [2,5,3,4,7,1,3]
a.sort()
print(a)
b = ["c","d","b","w","f","r"]
b.sort()
print(b)
[1, 2, 3, 3, 4, 5, 7]
['b', 'c', 'd', 'f', 'r', 'w']
c = [6,4,8,"e",3,"b","c"]
# This does not work
c.sort()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-122-ff5aa901bedc> in <module>()
1 c = [6,4,8,"e",3,"b","c"]
2 # This does not work
----> 3 c.sort()
TypeError: unorderable types: str() < int()
a = ["a","c","b"]
a.reverse()
a
a = [1,5,3,7]
a.insert(3,6)
a
a = [1,2,3,4,5,6,7]
a.remove(5)
a
a = [1,4,7,3,5,6,"d"]
a.count("d")
a = [1,2,3,4,5]
a.extend(["d","5","d",5])
a
[1, 2, 3, 4, 5, 'd', '5', 'd', 5]
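The index and pop methods from the list above are not shown in the examples. A minimal sketch with a made-up list:

```python
a = ["may", "the", "Force", "be", "with", "you"]

print(a.index("Force"))   # 2, position of the first match

last = a.pop()            # removes and returns the last element
print(last)               # you
second = a.pop(1)         # removes and returns the element at index 1
print(second)             # the
print(a)                  # ['may', 'Force', 'be', 'with']
```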
Tuple
Tuple is similar to list.
The difference between tuple and list is that tuple is immutable.
The data cannot be changed once it has been generated.
t1 = ()
# When a tuple has only one value, it must be followed by a comma
t2 = (1,)
t3 = (1)
print(type(t2), type(t3))
# When assigning a tuple to a variable, the parentheses can be omitted
t4 = (1,2,3)
t5 = 1,2,3
print(type(t4), type(t5))
<class 'tuple'> <class 'int'>
<class 'tuple'> <class 'tuple'>
Tuple indexing and slicing is similar to list
tup = (1,4,7,4,"d","d2",2,"4")
print(tup[4])
print(tup[2:])
d
(7, 4, 'd', 'd2', 2, '4')
Tuple addition and multiplication
tuple1 = (1,2,5)
tuple2 = (4,6,7)
tuple3 = tuple1 + tuple2
print(tuple3)
print(tuple2*4)
(1, 2, 5, 4, 6, 7)
(4, 6, 7, 4, 6, 7, 4, 6, 7, 4, 6, 7)
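Immutability means that assigning to an index of a tuple fails, while building a new tuple from the old one works. A small sketch (t and t_new are illustrative names):

```python
t = (1, 2, 3)
try:
    t[0] = 10                 # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# Build a new tuple instead of modifying the old one
t_new = (10,) + t[1:]
print(t_new)    # (10, 2, 3)
print(t)        # (1, 2, 3), the original is unchanged
```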
Dictionary
A dictionary is an important data structure in Python. Dictionaries are defined with a combination of curly braces {} and colons (:). The braces define the beginning and end of a dictionary and the colons indicate key-value pairs. A dictionary is essentially a set of key-value pairs. The key of any entry must be an immutable data type, which makes both strings and tuples candidates. Keys can be both added and deleted.
In the examples that follow, the keys are strings and the values are either strings or lists. Since a collection is still considered a single entity, we can use one to collect multiple values into one key-value pair.
Dictionary Example
Key | Value
name | Jake
phone | 01055999411
birth | 19941125
Personal_Info = {"name":"Jake",
"phone":"01055999411",
"birth":"19941125"}
Dictionary Usage & Manipulation
Idols = {"Twice": ["Jihyo","Nayeon","Jungyeon","Momo","Sana","Mina","Dahyun","Chaeyoung","Tzuyu"],
"Blackpink": ["Jisu","Jennie","Rose","Lisa"],
"Bigbang": ["GD","TOP","Taeyang","Daesung","Seungri"],
"BTS": ["RM","Suga","Jin","J-Hope","Jimin","V","Jungkook"]}
After defining a dictionary, we can access any values by indicating its key in brackets
# Using Key to retrieve values
print(Idols["BTS"])
print(Idols["Twice"])
['RM', 'Suga', 'Jin', 'J-Hope', 'Jimin', 'V', 'Jungkook']
['Jihyo', 'Nayeon', 'Jungyeon', 'Momo', 'Sana', 'Mina', 'Dahyun', 'Chaeyoung', 'Tzuyu']
A dictionary can be changed by assigning a new value to a given key
Idols["Bigbang"] = ["GD","Top","Daesung","Taeyang"]
print(Idols["Bigbang"])
['GD', 'Top', 'Daesung', 'Taeyang']
Cautions when making dictionary
A dictionary key is unique.
This means that the same key cannot be used twice: if it is repeated, only the last value is kept.
Also, a list cannot be used as a key. This will cause an error.
# the same key cannot be used twice (the key 5 appears two times)
xdict = {5:"a", 5:"b"}
xdict # only returns 5:"b"
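Adding and deleting keys, and the error caused by a list key, can be sketched as follows (the dictionary info is a made-up example):

```python
info = {"name": "Jake"}

info["phone"] = "01055999411"    # add a new key
del info["name"]                 # delete an existing key
print(info)                      # {'phone': '01055999411'}

info[(1, 2)] = "tuple key"       # a tuple is immutable, so it works
try:
    info[[1, 2]] = "list key"    # a list is mutable, so it fails
except TypeError as err:
    print("TypeError:", err)
```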
- keys
- values
- items
- clear
- get
- in
# Retrieve only "keys" from the dictionary
Idols.keys()
dict_keys(['BTS', 'Twice', 'Bigbang', 'Blackpink'])
# Retrieve only "values" from the dictionary
Idols.values()
dict_values([['RM', 'Suga', 'Jin', 'J-Hope', 'Jimin', 'V', 'Jungkook'], ['Jihyo', 'Nayeon', 'Jungyeon', 'Momo', 'Sana', 'Mina', 'Dahyun', 'Chaeyoung', 'Tzuyu'], ['GD', 'Top', 'Daesung', 'Taeyang'], ['Jisu', 'Jennie', 'Rose', 'Lisa']])
# Retrieve "keys" and "values" pairs from the dictionary
Idols.items()
dict_items([('BTS', ['RM', 'Suga', 'Jin', 'J-Hope', 'Jimin', 'V', 'Jungkook']), ('Twice', ['Jihyo', 'Nayeon', 'Jungyeon', 'Momo', 'Sana', 'Mina', 'Dahyun', 'Chaeyoung', 'Tzuyu']), ('Bigbang', ['GD', 'Top', 'Daesung', 'Taeyang']), ('Blackpink', ['Jisu', 'Jennie', 'Rose', 'Lisa'])])
# Delete everything in the dictionary
Personal_Info.clear()
Personal_Info
# Getting values with key
Idols.get("Blackpink")
['Jisu', 'Jennie', 'Rose', 'Lisa']
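`get` and bracket access behave differently when the key is missing. The sketch below uses a small made-up dictionary (`groups` is a hypothetical name) to compare the two.

```python
groups = {"Blackpink": ["Jisu", "Jennie", "Rose", "Lisa"]}  # small sample dictionary

# get() returns None when the key is missing, or a default value if you supply one
missing = groups.get("BtoB")
fallback = groups.get("BtoB", [])
print(missing, fallback)

# Bracket access raises KeyError for a missing key
try:
    groups["BtoB"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

`get` is handy when a missing key is an expected case rather than a mistake.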
# Checking if key is in dictionary
"BtoB" in Idols
Set
A set is made with the set() function. A set does not allow duplicate values, and it is unordered.
s1 = set([1,2,3])
s2 = set("This is set")
print(s1, s2)
{1, 2, 3} {'i', 'h', 'T', 't', 's', ' ', 'e'}
Because a set has no order, it cannot be indexed or sliced.
To index or slice its contents, convert it to a tuple or a list first.
l2 = list(s2)
print(l2)
print(l2[4])
['i', 'h', 'T', 't', 's', ' ', 'e']
s
Sets are useful for calculating intersections and unions
a = set([1,2,5,7,8,9])
b = set([2,6,3,8,4])
# Intersection
print(a & b)
print(a.intersection(b))
# union
print( a | b)
print(a.union(b))
# subtraction
print( a - b)
print( a.difference(b))
{8, 2}
{8, 2}
{1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 5, 9, 7}
{1, 5, 9, 7}
Set related functions
# Adding a value
s1 = set([1,4,6,7])
s1.add(9)
print(s1)
# Adding multiple values
s2 = set([5,8,3])
s2.update([1,4,5,6])
print(s2)
# Removing a value
s3 = set([2,5,8,1])
s3.remove(2)
print(s3)
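One detail worth knowing about `remove`: it raises an error when the value is not in the set, while `discard` does not. A short sketch with the same kind of sets as above:

```python
s3 = set([2, 5, 8, 1])

# remove() raises KeyError when the value is not in the set
try:
    s3.remove(100)
    raised = False
except KeyError:
    raised = True

# discard() removes the value if present and does nothing otherwise
s3.discard(100)
s3.discard(2)
print(s3, raised)

# update() ignores values that are already present
s2 = set([5, 8, 3])
s2.update([1, 4, 5, 6])
print(sorted(s2))
```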
Machine Learning Basic
27 Sep 2018
Machine Learning
“Machine Learning is a field of computer science that gives computers the ability to learn without being explicitly programmed.”
-Arthur Samuel-
Well-posed Learning Problem: A computer program is said to learn from experience E, with respect to some task T and some performance measure P, if its performance on T as measured by P, improves with experience E
-Tom Mitchell -
- Machine Learning gives systems the ability to learn and improve automatically from experience (data)
- It is a subfield of artificial intelligence research
- The concept is related to how humans learn and perceive things
- Most of its methods can be summarized or simplified as finding patterns in sample data and classifying them
- It can be used for prediction or for finding meaningful insights (just as humans learn from their past experience)
Application of Machine Learning
Types of Machine Learning
The three most distinctive types of machine learning are:
- Supervised Learning (Classification)
- Unsupervised Learning (Clustering)
- Reinforcement Learning
Supervised Learning
Supervised Learning trains a model with sample data that has labels (or answers). After the model has been trained on the labeled data, it is given other data and asked to predict the label values (or correct answers).
For example, supervised learning works with data like that shown above.
The data consists of handwritten digits. With supervised learning, you can train the computer on this dataset and then test whether it recognizes your own handwritten digits correctly (predicting labels).
Algorithms used in Supervised Learning
- Support Vector Machine (SVM)
- Hidden Markov Model
- Regression
- Neural Network
- Naive Bayes Classification
- KNN Classification
…..
Unsupervised Learning
Unsupervised Learning, on the other hand, is a learning method that does not use any labels or answers. It is used to search for patterns or clusters in the data.
As in the example above, unsupervised learning is used to find patterns, groups, or clusters in data with no specific labels or information.
Algorithms used in Unsupervised Learning
- K-means Clustering
- Hierarchical Clustering
- Mean-Shift Clustering
….
Difference between Supervised Learning and Unsupervised Learning
The key difference is in the “labeling”. For supervised learning, we provide a dataset with labels or answers. Unsupervised learning, on the other hand, has no labels: the computer automatically splits the data into groups (clusters) according to its features.
<img src="/assets/images/sup_vs_unsup.jpg">
Reinforcement Learning
Reinforcement Learning is quite different from supervised and unsupervised learning. It introduces a distinctive term, “reward”. When the subject (agent) analyses data (observation) and makes decisions (actions), it receives a reward. The agent then learns to make the decisions that maximize its reward.
Machine Learning with Python (Iris Dataset)
Python has great libraries and modules that support machine learning.
Scikit-Learn, a.k.a. sklearn, is the most representative one. Sklearn also provides basic datasets so that you can practice machine learning.
# importing sklearn in python
import sklearn
Let’s try to solve an example with machine learning
The Iris Dataset
The iris flower has three species in this dataset: Versicolor, Setosa, Virginica.
The dataset includes 150 flowers, each described by 4 numeric attributes: sepal length, sepal width, petal length, petal width.
The objective is to predict which species a flower belongs to by building a model with machine learning.
from sklearn import datasets
iris_data = datasets.load_iris()
iris_data
# ['sepal length (cm)','sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
data = iris_data.data
# The Labels: 0 = setosa, 1 = versicolor, 2 = virginica
label = iris_data.target
print(label)
Supervised Learning
In this case, the labels (or answers) would be species name.
KNN-algorithm
The KNN (k-nearest neighbors) algorithm is probably the simplest algorithm in machine learning, along with k-means clustering in unsupervised learning.
The idea is simple: with k = 3, KNN decides the label of a new data point by majority vote among the labels of its 3 closest neighbors.
For example, the green circle (data with no label) will be classified as a red triangle if k = 3. However, if k = 5, the green circle will be classified as a blue square.
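To make the voting concrete, here is a minimal from-scratch sketch of the idea. It is not how scikit-learn implements KNN; the function `knn_predict` and the toy points are made up so that the answer changes between k = 3 and k = 5, as in the example above.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, new_point, k=3):
    """Label new_point by majority vote among its k nearest training points."""
    # Euclidean distance from new_point to every training point
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up): two triangles sit close to the new point,
# three squares sit further away
points = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]
labels = ["triangle", "triangle", "square", "square", "square"]
new_point = (1.5, 1.5)

print(knn_predict(points, labels, new_point, k=3))  # triangle
print(knn_predict(points, labels, new_point, k=5))  # square
```

With k = 3 the two nearby triangles outvote one square; with k = 5 all three squares are counted and win the vote.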
Importing KNN library
# KNN model import
from sklearn.neighbors import KNeighborsClassifier
# We are going to split the whole dataset into two groups:
## Train group: training the model
## Test group: testing the trained model and see its accuracy
### train_test_split allows easy split of the data into train and test groups
from sklearn.model_selection import train_test_split
Splitting Data to train and test set
# Let's set K to 3 ( k = 3)
knnmodel = KNeighborsClassifier(n_neighbors= 3)
# Split the data into train and test
X_train, X_test, y_train, y_test = train_test_split(data,label, test_size=0.3)
# Train data = 70% of whole data
# Test data = 30% of whole data
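Conceptually, the split shuffles the rows and then cuts them into two parts. The following is a rough pure-Python sketch of that idea, not scikit-learn's actual implementation; `simple_split` and its `seed` parameter are hypothetical names, and the rows are stand-in data.

```python
import random

def simple_split(rows, labels, test_size=0.3, seed=0):
    """Shuffle row indices, then cut them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)      # fixed seed makes the split repeatable
    n_test = round(len(rows) * test_size)     # number of rows held out for testing
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

rows = [[i, i * 2] for i in range(150)]       # stand-in for 150 flowers
labels = [i % 3 for i in range(150)]          # stand-in labels
X_train, X_test, y_train, y_test = simple_split(rows, labels)
print(len(X_train), len(X_test))              # 105 45
```

Because the shuffle is random, an unseeded split gives a slightly different train and test set, and therefore a slightly different score, on every run.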
Training Model
# Train the model
## Your Machine(Model) learns here!!
knnmodel.fit(X_train, y_train)
Prediction and Score
# Let's see how well your machine predicts
## Predicted results from your machine(model)
prediction = knnmodel.predict(X_test)
## The answers
y_test
## Score & Feedback
from sklearn.metrics import accuracy_score
my_score = accuracy_score(y_true= y_test, y_pred= prediction)
print(my_score)
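The accuracy reported here is the fraction of test samples whose predicted label matches the true label. A hand-rolled sketch with made-up labels shows the calculation; the helper `simple_accuracy` is hypothetical.

```python
def simple_accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

answers = [0, 1, 2, 2, 1, 0, 1, 2]       # made-up true labels
predicted = [0, 1, 2, 1, 1, 0, 1, 2]     # made-up predictions, one mistake
print(simple_accuracy(answers, predicted))  # 7 of 8 correct -> 0.875
```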
Unsupervised Learning
Let’s try unsupervised learning this time.
Let’s assume that we do not know about the labels or the fact that iris is divided into three species. We will use K-means clustering algorithm to find how the flowers can be grouped.
K-means Clustering
K-means clustering tries to cluster data based on their similarity. In k-means clustering, we have to specify the number of clusters we want the data to be grouped into. The algorithm randomly assigns each observation to a cluster, and finds the centroid of each cluster. Then, the algorithm iterates through two steps:
- Reassign data points to the cluster whose centroid is closest.
- Calculate new centroid of each cluster.
These two steps are repeated until the within-cluster variation cannot be reduced any further. The within-cluster variation is calculated as the sum of the squared Euclidean distances between the data points and their respective cluster centroids.
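The loop described above can be sketched in plain Python. This is a teaching sketch, not scikit-learn's implementation: the function `kmeans` is hypothetical, the starting assignment is written out by hand (standing in for the random one) so the run is reproducible, and empty clusters are not handled.

```python
import math

def kmeans(points, assignment, k, max_iter=100):
    """Plain k-means: alternate centroid updates and reassignment until stable."""
    centroids = []
    for _ in range(max_iter):
        # Centroid of each cluster = mean of the points currently assigned to it
        centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            centroids.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
        # Reassign each point to the cluster whose centroid is closest
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:   # nothing moved, so we have converged
            break
        assignment = new_assignment
    return assignment, centroids

# Made-up data: three points near (1, 1) and three near (8.5, 8.5)
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
start = [0, 1, 0, 1, 0, 1]                 # stand-in for the random initial assignment

final_assignment, centroids = kmeans(points, start, k=2)
print(final_assignment)                    # [0, 0, 0, 1, 1, 1]
print(centroids)
```

Even though the starting assignment mixes the two groups, one reassignment step is enough here to separate them, and the next pass confirms that nothing changes.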
K-means library
# Import KMeans Clustering module
from sklearn.cluster import KMeans
Model fitting
# We know that the iris has 3 species
kmeansmodel = KMeans(n_clusters= 3)
# Model fitting
kmeansmodel.fit(data)
# Observing
# Respective cluster centroids, one row per cluster:
# [Cluster0,
#  Cluster1,
#  Cluster2]
centroids = kmeansmodel.cluster_centers_
print(centroids)
Results
# Let's see how our model clustered the data
print(kmeansmodel.labels_)
print("\n")
print(label)
# REMEMBER! KMeans is not classification: cluster numbers are arbitrary and need not match the label numbers.
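Because cluster numbers are arbitrary, cluster 0 need not correspond to species 0, so comparing the two printed arrays element by element can be misleading. One way to compare them, sketched here with made-up arrays (`cluster_ids`, `true_labels` and `majority_mapping` are hypothetical names), is to map each cluster to the most common true label inside it.

```python
from collections import Counter

def majority_mapping(cluster_ids, true_labels):
    """Map each cluster id to the true label that occurs most often in it."""
    mapping = {}
    for cluster in set(cluster_ids):
        members = [t for c, t in zip(cluster_ids, true_labels) if c == cluster]
        mapping[cluster] = Counter(members).most_common(1)[0][0]
    return mapping

cluster_ids = [1, 1, 1, 0, 0, 2, 2, 2, 0]   # made-up k-means output
true_labels = [0, 0, 0, 1, 1, 2, 2, 1, 1]   # made-up species labels

mapping = majority_mapping(cluster_ids, true_labels)
relabelled = [mapping[c] for c in cluster_ids]
matches = sum(r == t for r, t in zip(relabelled, true_labels))
print(mapping)
print(matches, "of", len(true_labels), "points agree after relabelling")
```

In this toy case cluster 1 corresponds to label 0 and cluster 0 to label 1, which a direct element-by-element comparison would count as errors.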
Visualizing the clusters
We are going to use PCA (Principal Component Analysis) to project our data into 2D. PCA is a statistical procedure that uses an orthogonal transformation to reduce dimensions.
For example, our iris data has 4 features, so we cannot plot it directly. (Assuming that 1 variable = 1 axis, 3 variables is the most we can plot in our three-dimensional world.) So, we are going to reduce our data to 2 dimensions for easy plotting.
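As a rough illustration of what the orthogonal transformation does, the sketch below performs PCA by hand with NumPy's SVD on made-up data. It is an assumption-laden sketch, not scikit-learn's implementation: the variable names are invented, and the signs of the resulting axes may differ from what a library returns.

```python
import numpy as np

# Made-up data: 6 samples with 4 features each (stand-in for the iris measurements)
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 4))

# 1. Centre each feature at zero
centred = samples - samples.mean(axis=0)

# 2. SVD of the centred data; the rows of vt are orthogonal principal directions,
#    ordered from largest to smallest variance
u, s, vt = np.linalg.svd(centred, full_matrices=False)

# 3. Project onto the first two directions to get 2-D coordinates
projected = centred @ vt[:2].T
print(projected.shape)   # (6, 2)
```

The first column of `projected` carries the most variance and the second column the next most, which is why two columns are often enough for a useful plot.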
Using plotly
# plotly is another visualization library just like matplotlib
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode()
PCA to reduce dimension for plotting
from sklearn.decomposition import PCA
import numpy as np
pca = PCA(n_components= 2)
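# Project the 4 features onto the first 2 principal components (fit once and reuse the result)
projected = pca.fit_transform(data)
# x is the 1st principal component, y is the 2nd (these are not the original sepal columns)
x = projected[:,0]
y = projected[:,1]
# Reshape the PCA x,y values into column vectors
X = x.reshape(150,1)
Y = y.reshape(150,1)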
Original Data
cluster0 = go.Scatter( x = np.reshape(X[label==0],50,),
y = np.reshape(Y[label==0],50,),
name = "Cluster0",
mode = "markers",
marker = dict( size = 10, color = "rgba(15,152,152,0.5)",
line = dict(width = 1, color = "rgb(0,0,0)")))
cluster1 = go.Scatter( x = np.reshape(X[label==1],50,),
y = np.reshape(Y[label==1],50,),
name = "Cluster1",
mode = "markers",
marker = dict( size = 10, color = "rgba(180,18,180,0.5)",
line = dict(width = 1, color = "rgb(0,0,0)")))
cluster2 = go.Scatter( x = np.reshape(X[label==2],50,),
y = np.reshape(Y[label==2],50,),
name = "Cluster2",
mode = "markers",
marker = dict( size = 10, color = "rgba(132,132,132,0.8)",
line = dict(width = 1, color = "rgb(0,0,0)")))
real_dat = [cluster0, cluster1, cluster2]
layout_real = go.Layout(
title = "Actual Groups")
real = go.Figure(data=real_dat, layout = layout_real)
Clustered Data
cluster0 = go.Scatter( x = np.reshape(X[kmeansmodel.labels_==0], -1),  # -1: cluster sizes vary between runs
y = np.reshape(Y[kmeansmodel.labels_==0], -1),
name = "Cluster0",
mode = "markers",
marker = dict( size = 10, color = "rgba(15,152,152,0.5)",
line = dict(width = 1, color = "rgb(0,0,0)")))
cluster1 = go.Scatter( x = np.reshape(X[kmeansmodel.labels_==1], -1),  # -1: cluster sizes vary between runs
y = np.reshape(Y[kmeansmodel.labels_==1], -1),
name = "Cluster1",
mode = "markers",
marker = dict( size = 10, color = "rgba(132,132,132,0.8)",
line = dict(width = 1, color = "rgb(0,0,0)")))
cluster2 = go.Scatter( x = np.reshape(X[kmeansmodel.labels_==2], -1),  # -1: cluster sizes vary between runs
y = np.reshape(Y[kmeansmodel.labels_==2], -1),
name = "Cluster2",
mode = "markers",
marker = dict( size = 10, color = "rgba(180,18,180,0.5)",
line = dict(width = 1, color = "rgb(0,0,0)")))
cluster_dat = [cluster0, cluster1, cluster2]
layout_clus = go.Layout(
title = "Grouped by Cluster Model")
cluster_model = go.Figure(data=cluster_dat, layout = layout_clus)
Visualized graph (original data vs clustered data)
iplot(real)
iplot(cluster_model)
Nowgeun's Blog
26 Sep 2018
Welcome!! I am Jaekeun Lee, a student of Yonsei University majoring in Creative Technology Management and living in Seoul, Republic of Korea.
Who am I?
I like to solve problems, challenge difficulties and influence others.
You can check my CV here.
Field of Interests
I am interested in, but not limited to, the following fields:
- Artificial Intelligence
- Statistical Thinking
- Natural Language Processing
- Social Innovation
And of course anything that is challenging or stimulates my curiosity.