Pandas is one of the most popular Python libraries for data analysis. It delivers highly optimized performance, with its back-end source code written in C or Python.
We can analyze data in pandas with:
- Series
- DataFrames
Series:
A Series is a one-dimensional (1-D) labeled array defined in pandas that can store data of any type.
Code #1: Creating Series
# Program to create a Series
import pandas as pd  # Import the pandas library

# Create a Series from Data and Index
# (Data and Index are placeholders, described below)
a = pd.Series(Data, index=Index)
Here, Data can be:
- A scalar value, such as an integer or a string (or a list of such values)
- A Python dictionary of key-value pairs
- An ndarray
Note: By default, the index is 0, 1, 2, …, (n-1), where n is the length of the data.
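The scalar case and the default index can be illustrated with a short sketch. The variable names and the labels 'a', 'b', 'c' are chosen here purely for illustration and are not part of the original examples.

```python
import pandas as pd

# A single scalar is repeated once for every label in the index.
# The labels below are arbitrary, illustrative choices.
s_scalar = pd.Series(5, index=['a', 'b', 'c'])

# Without an explicit index, pandas numbers the entries 0 .. n-1.
s_default = pd.Series([10, 20, 30])

print(s_scalar)
print(list(s_default.index))  # [0, 1, 2]
```

When Data is a single scalar, an index is needed so that pandas knows how many times to repeat the value.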
Code #2: When Data contains scalar values
# Program to create a Series from scalar values
import pandas as pd

Data = [1, 3, 4, 5, 6, 2, 9]  # Numeric data

# Create a Series with the default index values
s = pd.Series(Data)

# Predefined index values
Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Create a Series with the predefined index values
si = pd.Series(Data, index=Index)
Output:
Scalar Data with default Index
Scalar Data with Index
Code #3: When Data contains Dictionary
# Program to create a Series from a dictionary
import pandas as pd

dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# Create a Series from the dictionary; the keys become the index
sd = pd.Series(dictionary)
Output:
Dictionary type data
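As a minimal sketch of how dictionary keys map onto the index, the example below looks up a value by its key and then passes an explicit index alongside the dictionary. The label 'z' is a hypothetical key that is deliberately absent from the dictionary.

```python
import pandas as pd

dictionary = {'a': 1, 'b': 2, 'c': 3}
sd = pd.Series(dictionary)

# Dictionary keys become index labels, so values can be looked up by key.
print(list(sd.index))  # ['a', 'b', 'c']
print(sd['b'])         # 2

# An explicit index selects and reorders keys; a label that is not
# in the dictionary ('z' here) is filled with NaN.
sub = pd.Series(dictionary, index=['c', 'a', 'z'])
print(sub)
```

Because NaN is a floating-point value, the second Series is stored as floats rather than integers.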
Code #4: When Data contains Ndarray
# Program to create a Series from array-like data
import pandas as pd

# Define 2-D data as a nested list (2 rows of 3 values)
Data = [[2, 3, 4], [5, 6, 7]]

# Create the Series; each inner list becomes one element
snd = pd.Series(Data)
Output:
Data as Ndarray
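The example above passes a nested Python list rather than a NumPy array. The following sketch contrasts the two, assuming NumPy is installed. The variable names are illustrative, and the rejection of a 2-D array reflects the behaviour of recent pandas versions.

```python
import numpy as np
import pandas as pd

# A 1-D ndarray maps directly onto a Series, one value per index label.
arr = np.array([2.5, 3.0, 4.5])
s_arr = pd.Series(arr)
print(s_arr.dtype)  # float64

# A nested list yields one list object per element.
s_nested = pd.Series([[2, 3, 4], [5, 6, 7]])
print(len(s_nested))  # 2

# A genuine 2-D ndarray is rejected, because a Series is one-dimensional.
try:
    pd.Series(np.array([[2, 3, 4], [5, 6, 7]]))
    rejected = False
except Exception:
    rejected = True
print(rejected)
```

For genuinely two-dimensional data, a DataFrame is usually the better fit.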
DataFrames:
A DataFrame is a two-dimensional (2-D) data structure defined in pandas that consists of rows and columns.
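To make the row and column structure concrete, here is a small sketch that builds a DataFrame and inspects its shape and labels. The column names 'x' and 'y' are hypothetical.

```python
import pandas as pd

# Two columns ('x' and 'y') with three rows each.
df = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})

print(df.shape)          # (3, 2): 3 rows, 2 columns
print(list(df.columns))  # ['x', 'y']
print(list(df.index))    # [0, 1, 2]

# Selecting a single column returns a Series.
col = df['x']
print(type(col).__name__)  # Series
```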
Code #1: Creation of DataFrame
# Program to create a DataFrame
import pandas as pd  # Import the pandas library

# Create a DataFrame from Data (a placeholder, described below)
a = pd.DataFrame(Data)
Here, Data can be:
- One or more dictionaries
- One or more Series
- A 2-D NumPy ndarray
Code #2: When Data is Dictionaries
# Program to create a DataFrame from two dictionaries
import pandas as pd

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}          # Define dictionary 1
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}  # Define dictionary 2

# Define Data with dict1 and dict2; the outer keys become column names
Data = {'first': dict1, 'second': dict2}

df = pd.DataFrame(Data)  # Create the DataFrame
Output:
DataFrame with two dictionaries
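In the example above, dict1 has no key 'e' while dict2 does. The sketch below reuses the same dictionaries to check how the missing entry is handled.

```python
import pandas as pd

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}

# Outer keys become columns; the inner keys are combined into the row index.
df = pd.DataFrame({'first': dict1, 'second': dict2})

print(df.shape)  # (5, 2)

# dict1 has no 'e', so that cell is filled with NaN.
print(df.loc['e', 'first'])
print(df.loc['e', 'second'])  # 9
```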
Code #3: When Data is Series
# Program to create a DataFrame from three Series
import pandas as pd

s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])            # Define series 1
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])   # Define series 2
s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])        # Define series 3

Data = {'first': s1, 'second': s2, 'third': s3}  # Define Data
dfseries = pd.DataFrame(Data)                    # Create the DataFrame
Output:
DataFrame with three series
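When Series are combined into a DataFrame, pandas lines them up by index label rather than by position. The sketch below uses hypothetical labels 'x', 'y', and 'z' to make that alignment visible.

```python
import pandas as pd

# Two Series with different lengths and differently ordered labels.
s1 = pd.Series([1, 3, 4], index=['x', 'y', 'z'])
s2 = pd.Series([1.1, 3.5], index=['z', 'x'])

df = pd.DataFrame({'first': s1, 'second': s2})

# Values are matched by label, not by position.
print(df.loc['x', 'second'])  # 3.5
print(df.loc['z', 'second'])  # 1.1

# s2 has no 'y', so that cell is NaN.
print(df.loc['y', 'second'])
```

The same mechanism explains why the shorter Series in the example above are padded with NaN at the positions they do not cover.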
Code #4: When Data is 2D-numpy ndarray
Note: One constraint applies when creating a DataFrame from several 2-D arrays: the arrays must have the same dimensions, so that every column ends up with the same number of rows.
# Program to create a DataFrame from 2-D arrays
import pandas as pd  # Import the pandas library

d1 = [[2, 3, 4], [5, 6, 7]]  # Define 2-D array 1 (as a nested list)
d2 = [[2, 4, 8], [1, 3, 9]]  # Define 2-D array 2 (as a nested list)

Data = {'first': d1, 'second': d2}  # Define Data

# Create the DataFrame; each cell holds one inner list
df2d = pd.DataFrame(Data)
Output:
DataFrame with 2d ndarray
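The example above stores whole lists inside each cell. As an alternative sketch, assuming NumPy is installed, a single 2-D ndarray can be passed directly so that each array element becomes its own cell. The column names 'c0', 'c1', and 'c2' are hypothetical.

```python
import numpy as np
import pandas as pd

# A 2-D ndarray with 2 rows and 3 columns.
arr = np.array([[2, 3, 4], [5, 6, 7]])

# Each array row becomes a DataFrame row; columns are named explicitly.
df_arr = pd.DataFrame(arr, columns=['c0', 'c1', 'c2'])

print(df_arr.shape)         # (2, 3)
print(df_arr.loc[1, 'c2'])  # 7
```

If the columns argument is omitted, pandas labels the columns 0, 1, 2 by default.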
This article is attributed to GeeksforGeeks.org