INTRODUCTION TO PYTHON FOR DATA ANALYSIS

Kiregi Paul - Oct 4 - - Dev Community

Python is a high-level, interpreted programming language known for its readability and simplicity. It was created by Guido van Rossum and first released in 1991.
Its emphasis on code clarity makes it a good choice for both beginners and experienced developers.
Python is used in many fields, including data analysis, data science, web development, and automation.

WHY PYTHON FOR DATA ANALYSIS

  1. Rich libraries: Python has a variety of libraries designed specifically for data analysis, such as:

    pandas- data manipulation and analysis
    NumPy- numerical computing and array handling
    Matplotlib and seaborn- data visualization
    SciPy- scientific and technical computing
    scikit-learn- machine learning

  2. Community support: Python has a large and active community, so resources, tutorials, and forums where you can get assistance are easy to find.
  3. Integration: Python integrates easily with other languages and technologies, which makes it suitable for complex workflows.
  4. Flexibility: The same language can be used for exploratory data analysis, creating visualizations, and building machine learning models.

DATA TYPES
Python has several built-in data types, which can be categorized as:

  1. Numeric types

    int- integer values (1, 5, 7)
    float- floating-point numbers, that is, decimals (2.4, 3.142)
    complex- complex numbers with a real and an imaginary part (4 + 2j)

  2. Sequence types

    str- string, a sequence of characters "hello world"
    list- mutable ordered sequence of elements [1, 2, 3] ["dog", "cats", "cows"]
    tuple- immutable ordered sequence of elements (1, 2, 3) ("dog", "cats", "cows")
    range- represents a range of numbers range(0, 10)

  3. Mapping type

    dict- dictionary, a collection of key-value pairs {"name": "alice", "age": 30}

  4. Set types

    set- unordered collection of unique elements {1, 2, 3} {"dog", "cats"}

  5. Boolean type

    bool- represents True and False
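
As a rough illustration of the types listed above, the built-in type() function reports the type of any value. The sample values below are made up for this sketch and are not part of the original article.

```python
# Hypothetical sample values, one for each built-in type discussed above.
samples = [
    7,                             # int
    3.142,                         # float
    4 + 2j,                        # complex
    "hello world",                 # str
    [1, 2, 3],                     # list
    (1, 2, 3),                     # tuple
    range(0, 10),                  # range
    {"name": "alice", "age": 30},  # dict
    {1, 2, 3},                     # set
    True,                          # bool
]

# Collect the name of each value's type so it can be printed or inspected.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(f"{value!r} is of type {name}")
```

Running this in a notebook cell prints one line per value, which is a quick way to check what type a piece of data has before analysing it.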

GETTING STARTED
To get started with Python, first install the necessary tools: a Python distribution, a notebook environment, and, optionally, virtual environments.
Anaconda is a popular distribution to work with because it comes with many data analysis libraries pre-installed. After installing Anaconda, launch Jupyter Notebook, which will be used to run the Python code.
After installation, you can start with a simple program to familiarize yourself with Python syntax:

# printing hello world
print("hello world!")

The output is:
hello world!

Remember that hello world! is a string, so it must be enclosed in quotation marks ("") to be printed.
The following code shows the difference between printing a string and printing the value stored in a variable:

# printing fruits such as oranges, bananas, apples
fruits = ("oranges", "bananas", "apples")
print(fruits)
print("fruits")

There are two outputs from this code:
('oranges', 'bananas', 'apples')
fruits

The first output comes from fruits written without quotation marks. It is a variable name, so Python prints the values stored in it (here a tuple, shown with its parentheses and quotes).
The second output is the literal string fruits, because of the quotation marks ("").

COMMENTS
Comments are useful when writing code because they explain why a piece of code was written.
In Python, comments can be written anywhere in the code using the hash (#) sign. Everything from the # to the end of the line is ignored when the code runs.

# this code adds 2+2 and gives the output
# this is an illustration of how comments work
a = 2 + 2
print(a)

The output
4

Comments are not executed when the code runs, so they do not affect its behaviour.
Comments are not displayed in the output.
Comments can be written in any language that the users of the code understand.

VARIABLES
Variables are containers used to store data values, either numerical or textual.
Rules when naming variables:
Variable names are case-sensitive: Name is not the same as name.
Keywords (such as if, for, and while) cannot be used as variable names.
Variable names cannot contain spaces; use an underscore instead.
Variable names must start with a letter or an underscore, not a digit.
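
The naming rules above can be sketched in a few lines. The variable names and values here are hypothetical, chosen only for illustration; the keyword module is part of Python's standard library.

```python
import keyword

# Names that differ only in case refer to different variables.
name = "alice"
Name = "bob"

# An underscore stands in for a space in a multi-word name.
total_sales = 1500.75

# keyword.iskeyword() tells you whether a word is reserved by Python.
for_is_reserved = keyword.iskeyword("for")                  # True: cannot be a variable name
total_sales_is_reserved = keyword.iskeyword("total_sales")  # False: allowed as a name

print(name, Name, total_sales)
print(for_is_reserved, total_sales_is_reserved)
```

Trying to assign to a keyword, for example for = 3, stops the program with a SyntaxError, which is why checking a name this way can be handy.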

ARITHMETIC OPERATORS
These operators are used to perform mathematical calculations.
addition (+)
subtraction (-)
division (/) always returns a float
floor division (//) divides two numbers and rounds the result down to the nearest whole number
modulus (%) returns the remainder after division
exponentiation (**) raises the first number to the power of the second number
multiplication (*)
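
Here is a small sketch of the arithmetic operators in action. The values 17 and 5 are arbitrary and chosen only so that each operator gives a different result.

```python
a = 17
b = 5

addition = a + b          # 22
subtraction = a - b       # 12
multiplication = a * b    # 85
division = a / b          # 3.4 (division always gives a float)
floor_division = a // b   # 3 (the result is rounded down)
modulus = a % b           # 2 (the remainder of 17 divided by 5)
power = a ** 2            # 289 (17 squared)

# Floor division rounds down, so a negative result moves away from zero.
negative_floor = -17 // 5  # -4, not -3

print(addition, subtraction, multiplication, division)
print(floor_division, modulus, power, negative_floor)
```

The last line is worth noting: floor division does not round to the nearest whole number, it always rounds down.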

COMPARISON OPERATORS
== equal to
!= not equal
< less than
<= less than or equal to
> greater than
>= greater than or equal to
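
Every comparison evaluates to a Boolean, True or False. The sketch below uses two arbitrary values, x and y, to show the result of each operator.

```python
x = 10
y = 5

# Each comparison produces either True or False.
comparisons = {
    "x == y": x == y,   # False
    "x != y": x != y,   # True
    "x < y": x < y,     # False
    "x <= y": x <= y,   # False
    "x > y": x > y,     # True
    "x >= y": x >= y,   # True
}

for expression, result in comparisons.items():
    print(f"{expression} -> {result}")
```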

LOGICAL OPERATORS
and returns True if both statements are true
or returns True if at least one of the statements is true
not reverses the result, giving True if the statement is false and False if the statement is true
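
In Python code the logical operators are written in lowercase. The following sketch combines them with comparisons; the variable names age and has_id are hypothetical and used only for illustration.

```python
age = 25
has_id = True

# and: both sides must be true.
can_enter = age >= 18 and has_id        # True

# or: at least one side must be true.
needs_review = age < 18 or not has_id   # False

# not: reverses a single condition.
is_minor = not (age >= 18)              # False

print(can_enter, needs_review, is_minor)
```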

CONTROL STRUCTURES
Control structures let you direct the flow of your program based on various conditions. They help in making decisions, repeating actions, and managing the flow of code execution.
We will look at the following control structures:
1. CONDITIONAL STATEMENTS
if statement- the if statement executes a block of code if the specified condition is true.
If the condition is false, the block is skipped and nothing is output.


# example of an if statement
x = 10
# if the comparison is changed from > to <, the condition is false and nothing is printed
if x > 5:
    print("The value is greater than 5")

output:
The value is greater than 5

if-else statement- the if part works as described above, while the else block is the alternative that runs when the if condition is false.
Exactly one of the two blocks always runs, because a condition can only be either true or false, never both.

# we will reuse the code above, but with a different comparison operator
x=10
if x<5:
    print("the value is less than 5")
else:
    print("the value is greater than 5")

output:
the value is greater than 5

if-elif-else statement- the if-elif-else statement extends the if-else statement by checking multiple conditions in order.

# red- stop, yellow-get ready, green-go
# automated traffic light
colour = "green"
if colour == "red":
    print("stop")
elif colour == "yellow":
    print("get ready")
elif colour == "green":
    print("go")
else:
    print("invalid traffic code")


output:
go

The if-elif-else statement checks each condition in order until one is true, and runs the else block if none of the conditions is true.

2. LOOPS
for loop- a for loop iterates over a sequence such as a list, a tuple, or a string.

numbers = [1, 2, 3, 4, 5, 6]
for num in numbers:
    print(num)

output:
1
2
3
4
5
6

while loop- a while loop executes as long as a certain condition remains true.

day = 1
while day <= 7:
    print(day)
    day += 1


output:
1
2
3
4
5
6
7
3. LOOP CONTROL STATEMENTS
break- exits the loop immediately when its condition is met, skipping any remaining iterations.

for num in range(10):
    if num == 5:
        break  # leave the loop as soon as num reaches 5
    print(num)

output:

0
1
2
3
4

continue- skips the rest of the current iteration when its condition is met and proceeds to the next iteration.

for num in range(10):
    if num == 5:
        continue  # skip printing 5 and move on to 6
    print(num)

output:

0
1
2
3
4
6
7
8
9

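4. NESTED CONTROL STRUCTURES
One control structure can be placed inside another, for example a for loop inside an if statement.

num = 10
if num > 5:
    for i in range(5):  # i is used instead of int to avoid shadowing the built-in int type
        print(f"{num} is greater than 5: {i}")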

output:
10 is greater than 5: 0
10 is greater than 5: 1
10 is greater than 5: 2
10 is greater than 5: 3
10 is greater than 5: 4
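
Nesting also works the other way round, with an if statement inside a for loop. This is a common pattern when filtering data. The sketch below uses a made-up list of readings and a made-up threshold; neither comes from the original article.

```python
# Hypothetical readings and threshold, for illustration only.
readings = [3, 8, 12, 5, 20]
threshold = 5

above_threshold = []
for value in readings:
    # The if statement is checked once for every value in the list.
    if value > threshold:
        above_threshold.append(value)

print(above_threshold)  # [8, 12, 20]
```

Only the values that pass the condition are kept, so 3 and 5 are left out of the result.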

This is just an overview of Python for data analysis, and it forms a good foundation for your Python journey in data analysis.
In a different article, we will look at the different libraries used in data analysis.
