Python Pandas DataFrame: A Comprehensive Guide + 13 Ex

Created On February 17, 2023

Pandas is a software library for the Python programming language that provides data structures and operations for manipulating and analyzing numerical tables and time series.

Table of Contents

What is Python Pandas DataFrame?
How to Create a Basic Pandas DataFrame
How To Create DataFrame From dict, arrays, and lists
How To Select One Or More Column In Pandas DataFrame
How To Select One or More Rows In Pandas Data Frame
Handling Missing Data In Pandas DataFrame
How to Iterate Over The Rows In Pandas Data Frame
How To Iterate Over Columns In Pandas Data Frame
Conclusion

What is Python Pandas DataFrame?

A DataFrame is a 2-dimensional labeled data structure whose columns can hold different types. You can think of it as a spreadsheet, a SQL table, or a dict of Series objects. It is the most commonly used pandas object. Along with the data, you can optionally pass index (row labels) and columns (column labels) arguments. If you pass an index or columns, you guarantee the index or columns of the resulting DataFrame. DataFrames are a common starting point for data visualization and for preparing data for machine learning models.
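
As a minimal sketch of the index and columns arguments described above, the snippet below builds a small DataFrame with explicit row and column labels. The data and the label names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the labels below are illustrative only.
data = [[100, 2], [50, 4]]
df = pd.DataFrame(data, index=["table", "chair"], columns=["Price", "Quantity"])

print(df)
print(list(df.index))    # row labels passed via index=
print(list(df.columns))  # column labels passed via columns=
```

Because the labels are passed explicitly, the resulting DataFrame uses them as given instead of the default integer labels.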

This article explains the pandas DataFrame in detail.

How to Create a Basic Pandas DataFrame

Creating a DataFrame in pandas is easy, and there are many ways to do it. The most direct is to pass your data to the pd.DataFrame constructor.

Example (1)

import pandas as pd

# Creating a DataFrame from a dictionary
data = {'Country': ['Belgium', 'India', 'Brazil'],
        'Capital': ['Brussels', 'New Delhi', 'Brasília'],
        'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data)
print(df)

Output:

	Country	Capital	Population
0	Belgium	Brussels	11190846
1	India	New Delhi	1303171035
2	Brazil	Brasília	207847528

Observe that we did not provide an index for the DataFrame, yet the output has one. By default, pandas labels the rows with integers starting at 0.
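
The automatic row labels can be inspected through the index attribute. The sketch below also shows one way to replace them, using set_index with the Country column; choosing that column as the label is an assumption made purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Country": ["Belgium", "India", "Brazil"]})
print(df.index)  # the default index: integers 0, 1, 2

# Assumption for illustration: use the country name as the row label instead.
df_by_country = df.set_index("Country")
print(df_by_country.index)
```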

How To Create DataFrame From dict, arrays, and lists

You can create a Pandas DataFrame from a variety of structures in Python, including

Dictionaries
NumPy arrays
Lists

Here’s an example of creating a DataFrame from a dictionary:

Example (2)

import pandas as pd

# Creating a DataFrame from a dictionary
data = {'Country': ['Belgium', 'India', 'Brazil'],
        'Capital': ['Brussels', 'New Delhi', 'Brasília'],
        'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data)
print(df)

This would produce a DataFrame with 3 rows and 3 columns, with the column names ‘Country’, ‘Capital’, and ‘Population’.

Output:

	Country	Capital	Population
0	Belgium	Brussels	11190846
1	India	New Delhi	1303171035
2	Brazil	Brasília	207847528

Here’s an example of creating a DataFrame from a NumPy array:

Example (3)

import numpy as np
import pandas as pd

# Creating a DataFrame from a NumPy array
a = np.array([[1, 2, 3], [4, 5, 6]])
df = pd.DataFrame(a, columns=['a', 'b', 'c'])
print(df)

Output:

	a	b	c
0	1	2	3
1	4	5	6

This would create a DataFrame with 2 rows and 3 columns, with column names ‘a’, ‘b’, and ‘c’.

And here’s an example of creating a DataFrame from a list:

Example (4)

import pandas as pd

# Creating a DataFrame from a list
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)

Output:

	Name	Age
0	Alex	10
1	Bob	12
2	Clarke	13

This would create a DataFrame with 3 rows and 2 columns, with column names ‘Name’ and ‘Age’.
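
A related variation, shown here as a sketch rather than as part of the original examples, is a list of dictionaries. In that case the column names come from the dictionary keys, and a key that is absent from a record shows up as NaN. The records below are hypothetical.

```python
import pandas as pd

# Hypothetical records; the last one deliberately has no "Age" key.
records = [
    {"Name": "Alex", "Age": 10},
    {"Name": "Bob", "Age": 12},
    {"Name": "Clarke"},
]
df = pd.DataFrame(records)
print(df)
```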

How To Select One Or More Column In Pandas DataFrame

To select one or multiple columns from a Pandas DataFrame, you can use the [] operator along with the column name(s).

Example (5)

import pandas as pd

# Create a sample dataframe
data = {'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
        'Price': [100, 50, 300, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Select the 'Product' and 'Quantity' columns
product_and_quantity = df[['Product', 'Quantity']]

# Print the 'Product' and 'Quantity' columns
print(product_and_quantity)

Output:

	Product	Quantity
0	Table	2
1	Chair	4
2	Sofa	1
3	Bed	3

In the above example, we passed a list containing the Product and Quantity column names to get those two columns out of the DataFrame. You can select any columns this way.
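
One detail worth seeing in code is the difference between single and double brackets. The sketch below uses a small sample DataFrame (an assumption for illustration) to show that a single column name returns a Series, while a list of names returns a DataFrame even when the list has only one entry.

```python
import pandas as pd

df = pd.DataFrame({"Product": ["Table", "Chair"], "Price": [100, 50]})

prices_series = df["Price"]    # single column name -> Series
prices_frame = df[["Price"]]   # list of column names -> DataFrame

print(type(prices_series).__name__)
print(type(prices_frame).__name__)
```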

How To Select One or More Rows In Pandas Data Frame

Selecting rows is also relatively easy. You can select them either by label or by integer position.

To select rows in a Pandas DataFrame, you can use one of the following methods:

df.loc[]: Selects rows by label

df.iloc[]: Selects rows by integer position

Here’s an example using the df.loc[] method to select rows in a Pandas DataFrame:

Example (6)

import pandas as pd

# Create a sample dataframe
data = {'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
        'Price': [100, 50, 300, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Select rows with index labels 0 and 2
selected_rows = df.loc[[0, 2]]

# Print the selected rows
print(selected_rows)

This would print the rows with index labels 0 and 2:

Output:

	Product	Price	Quantity
0	Table	100	2
2	Sofa	300	1

Here’s an example using the df.iloc[] method to select rows in a Pandas DataFrame:

Example (7)

import pandas as pd

# Create a sample dataframe
data = {'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
        'Price': [100, 50, 300, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Select rows with indices 0 and 2
selected_rows = df.iloc[[0, 2]]

# Print the selected rows
print(selected_rows)

This would print the rows at positions 0 and 2, which here are the same rows as before because the default labels match the positions:

Output:

	Product	Price	Quantity
0	Table	100	2
2	Sofa	300	1
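
With the default integer index, loc and iloc can look interchangeable. The sketch below uses hypothetical string labels to make the difference visible: loc looks up labels, iloc looks up positions, and slices behave differently between the two.

```python
import pandas as pd

df = pd.DataFrame(
    {"Price": [100, 50, 300, 200]},
    index=["table", "chair", "sofa", "bed"],  # hypothetical string labels
)

by_label = df.loc[["table", "sofa"]]   # label-based selection
by_position = df.iloc[[0, 2]]          # position-based selection
print(by_label.equals(by_position))

# Slices differ: loc includes the end label, iloc excludes the end position.
print(len(df.loc["table":"sofa"]))
print(len(df.iloc[0:2]))
```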

You can also use boolean indexing to select rows in a DataFrame based on a condition. For example:

Example (8)

import pandas as pd

# Create a sample dataframe
data = {'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
        'Price': [100, 50, 300, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Select rows where the price is greater than 100
selected_rows = df[df['Price'] > 100]

# Print the selected rows
print(selected_rows)

This would select and print the rows where the price is greater than 100:

Output:

	Product	Price	Quantity
2	Sofa	300	1
3	Bed	200	3
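
Conditions can also be combined. The following sketch reuses the same sample data and adds a second condition on Quantity; the threshold of 2 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
                   'Price': [100, 50, 300, 200],
                   'Quantity': [2, 4, 1, 3]})

# Each condition needs its own parentheses; use & for "and", | for "or".
expensive_and_stocked = df[(df['Price'] > 100) & (df['Quantity'] >= 2)]
print(expensive_and_stocked)
```

Python's and / or keywords do not work element-wise on a Series, which is why the & and | operators are used here.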

Handling Missing Data In Pandas DataFrame

In a Pandas DataFrame, missing data is represented as NaN (not a number). There are several ways to handle missing data in a Pandas DataFrame:

Drop rows with missing data: You can use the dropna() function to drop rows that contain missing data. For example:

Example (9)

import numpy as np
import pandas as pd

# Create a sample dataframe with missing data
data = {'Product': ['Table', 'Chair', np.nan, 'Bed'],
        'Price': [100, 50, np.nan, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Drop rows with missing data
df.dropna(inplace=True)

# Print the resulting dataframe
print(df)

This would drop the row with the missing data and print the resulting DataFrame:

Output:

	Product	Price	Quantity
0	Table	100.0	2
1	Chair	50.0	4
3	Bed	200.0	3
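
If only some columns matter, dropna() accepts a subset argument. The sketch below, on hypothetical data, drops only the rows that are missing a Price, and it returns a new DataFrame instead of modifying the original in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['Table', np.nan, 'Sofa'],
                   'Price': [100, 50, np.nan]})

# Only rows missing a Price are dropped; the row with a missing Product is kept.
cleaned = df.dropna(subset=['Price'])
print(cleaned)
print(len(df))  # the original DataFrame is unchanged
```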

Fill missing data with a placeholder value: You can use the fillna() function to fill missing data with a placeholder value. For example:

Example (10)

import numpy as np
import pandas as pd

# Create a sample dataframe with missing data
data = {'Product': ['Table', 'Chair', np.nan, 'Bed'],
        'Price': [100, 50, np.nan, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Fill missing data with 0
df.fillna(0, inplace=True)

# Print the resulting dataframe
print(df)

This would fill the missing data with 0 and print the resulting DataFrame:

Output:

	Product	Price	Quantity
0	Table	100.0	2
1	Chair	50.0	4
2	0	0.0	1
3	Bed	200.0	3
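
A single placeholder such as 0 rarely suits every column, as the Product value above shows. As a sketch, fillna() also accepts a dictionary that maps column names to fill values; the placeholder values below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['Table', np.nan, 'Bed'],
                   'Price': [100, np.nan, 200]})

# The placeholder values here are illustrative assumptions.
filled = df.fillna({'Product': 'Unknown', 'Price': 0})
print(filled)
```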

Interpolate missing data: You can use the interpolate() function to interpolate missing data based on neighboring values. For example:

Example (11)

import numpy as np
import pandas as pd

# Create a sample dataframe with missing data
data = {'Product': ['Table', 'Chair', np.nan, 'Bed'],
        'Price': [100, 50, np.nan, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Interpolate the numeric 'Price' column; text columns cannot be interpolated
df['Price'] = df['Price'].interpolate()

# Print the resulting dataframe
print(df)

This would interpolate the missing price based on the neighboring values and print the resulting DataFrame:

Output:

	Product	Price	Quantity
0	Table	100.0	2
1	Chair	50.0	4
2	NaN	125.0	1
3	Bed	200.0	3

Observe that in the above example, pandas filled the missing price with 125.0, the midpoint of its neighbors 50 and 200, because interpolate() uses linear interpolation by default. The missing product name stays NaN, since there is nothing to interpolate between text values.
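
To make the default behavior of interpolate() concrete, here is a minimal sketch on a standalone Series with two consecutive gaps. The values are hypothetical; with linear interpolation the gaps are filled with evenly spaced values between the known neighbors.

```python
import numpy as np
import pandas as pd

# Hypothetical values with two consecutive gaps.
prices = pd.Series([100, np.nan, np.nan, 400])
interpolated = prices.interpolate()
print(interpolated.tolist())  # [100.0, 200.0, 300.0, 400.0]
```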

How to Iterate Over The Rows In Pandas Data Frame

Iterating over the rows of a DataFrame is straightforward with the iterrows() method, which yields each row's index label together with the row as a Series. Individual cell values can then be read by column name. The example below illustrates this:

Example (12)

import pandas as pd

# Create a sample dataframe
data = {'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
        'Price': [100, 50, 300, 200],
        'Quantity': [2, 4, 1, 3]}
df = pd.DataFrame(data)

# Iterate over the rows
for index, row in df.iterrows():
    # Calculate the total price for the row
    total_price = row['Price'] * row['Quantity']

    # Print the product name and total price
    print(row['Product'], total_price)

This would print the product name and total price for each row:

Output:
Table	200
Chair	200
Sofa	300
Bed	600
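
For a calculation like this one, a row loop is not strictly necessary. As a sketch of an alternative, the same totals can be computed for all rows at once by multiplying the two columns; the Total column name is an assumption introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Table', 'Chair', 'Sofa', 'Bed'],
                   'Price': [100, 50, 300, 200],
                   'Quantity': [2, 4, 1, 3]})

# Multiply the columns element-wise; 'Total' is a hypothetical column name.
df['Total'] = df['Price'] * df['Quantity']
print(df[['Product', 'Total']])
```

Column arithmetic of this kind is usually faster than iterrows() on large DataFrames, so the loop is best kept for logic that cannot be expressed column-wise.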

How To Iterate Over Columns In Pandas Data Frame

Just like the previous example, we can also iterate over the columns using the items() method, which yields each column name together with the column as a Series. Older tutorials use iteritems() for this, but that method was removed in pandas 2.0.

Example (13)

import pandas as pd

# Create a sample dataframe
data = {'Country': ['USA', 'India', 'China'],
        'Capital': ['Washington', 'New Delhi', 'Beijing'],
        'Population': [331002651, 1303171035, 207847528]}
df = pd.DataFrame(data)

# Iterate over the columns
for column_name, column in df.items():
    # Print the column name and values
    print(column_name)
    print(column)

Output:

Country
0      USA
1    India
2    China
Name: Country, dtype: object
Capital
0    Washington
1     New Delhi
2       Beijing
Name: Capital, dtype: object
Population
0     331002651
1    1303171035
2     207847528
Name: Population, dtype: int64
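
Since each column arrives as a Series, any Series operation can be used inside the loop. The sketch below, on a small hypothetical DataFrame, collects the names of the numeric columns using pd.api.types.is_numeric_dtype.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'India'],
                   'Population': [331002651, 1303171035]})

numeric_columns = []
for column_name, column in df.items():
    # column is a Series, so its dtype can be checked directly
    if pd.api.types.is_numeric_dtype(column):
        numeric_columns.append(column_name)

print(numeric_columns)
```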

Conclusion

In this article, we covered the pandas DataFrame in detail: creating one, selecting rows and columns, handling missing data, and iterating over rows and columns. We recommend that readers explore the pandas library further and work through more examples. We also encourage readers to post their queries in our Oraask community.

Last Updated On August 18, 2023

by Asif Rahaman
