Pandas

Basic Data Operations

Selecting and indexing data

Selecting, indexing, and slicing are the basic operations behind most data manipulation and analysis in Pandas, and more advanced techniques build on them. This section covers each of them in turn, first for Series and then for DataFrames.

Series

Selecting data from a Pandas Series involves accessing specific elements or subsets of elements based on their index labels, positions, or certain conditions.
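
Selection by condition is worth a short illustration alongside selection by label and position. A minimal sketch, assuming the five-element Series used throughout this section; the threshold of 25 is arbitrary and chosen only for illustration:

```python
import pandas as pd

# Five-element Series with string index labels
s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# A comparison produces a boolean Series aligned on the same index
mask = s > 25

# Passing the mask in square brackets keeps only the elements where it is True
selected = s[mask]
print(selected)
```

The mask and the Series share the same index, so the filtered result keeps the original labels of the elements that pass the condition.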

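Series – select data through positional indexing

import pandas as pd

# Series with string index labels
s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# Selecting a single element by position
# (.iloc is explicit; plain s[1] is deprecated in recent pandas when the index holds labels)
print(s.iloc[1])

# Selecting multiple elements by position
print(s.iloc[[0, 4]])

Output: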
20
A    10
E    50
dtype: int64

Series – select data through label indexing

import pandas as pd

# Series
s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# Selecting a single element
print(s['B'])

# Selecting multiple elements
print(s[['A', 'E']])

Output:
20
A    10
E    50
dtype: int64

DataFrame

Selecting data from a Pandas DataFrame involves accessing specific rows, columns, or subsets of data based on various criteria such as index labels, positions, or conditions.
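
Selecting rows by condition can be sketched briefly. The example below assumes the small three-column DataFrame used throughout this section; the thresholds are arbitrary and chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Keep the rows where column 'A' is greater than 1
rows = df[df['A'] > 1]
print(rows)

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
both = df[(df['A'] > 1) & (df['C'] < 9)]
print(both)
```

The parentheses matter because & and | bind more tightly than comparison operators in Python.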

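DataFrame – select data through label indexing (column name)

import pandas as pd

# DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Selecting a single column (returns a Series)
print(df['A'])

# Selecting multiple columns (returns a DataFrame)
print(df[['A', 'C']])

# Selecting a specific value: row label 1 in column 'A'
# (a single .loc call avoids chained indexing such as df['A'][1])
print(df.loc[1, 'A'])

Output: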
0    1
1    2
2    3
Name: A, dtype: int64

   A  C
0  1  7
1  2  8
2  3  9

2

DataFrame – select data through indexing (loc & iloc)

# Selecting rows by index label
print(df.loc[0])  # Select the row whose index label is 0

# Selecting rows by position
print(df.iloc[0])  # Select the first row

Output:
A    1
B    4
C    7
Name: 0, dtype: int64

A    1
B    4
C    7
Name: 0, dtype: int64
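
In the example above, loc[0] and iloc[0] return the same row only because the DataFrame uses the default integer index, where labels and positions coincide. A minimal sketch of how the two differ, assuming a hypothetical index of string labels ('x', 'y', 'z') that is not part of the original example:

```python
import pandas as pd

# Same data as before, but with string row labels (chosen for illustration)
df_labeled = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['x', 'y', 'z'],
)

# loc looks rows up by label
by_label = df_labeled.loc['y']

# iloc looks rows up by integer position, regardless of the labels
by_position = df_labeled.iloc[1]

print(by_label.equals(by_position))  # both refer to the second row
```

With this index, df_labeled.loc[1] would raise a KeyError, because 1 is a position and not one of the labels.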

DataFrame – select rows and columns simultaneously

import pandas as pd

# DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Selecting specific rows and columns by label
print(df.loc[[0, 2], ['A', 'B']])

Output:
   A  B
0  1  4
2  3  6
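
The same selection can also be written by position rather than by label. A short sketch using iloc with lists of integer positions; with the default index used here, row positions 0 and 2 happen to match row labels 0 and 2:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Rows at positions 0 and 2, columns at positions 0 and 1 ('A' and 'B')
subset = df.iloc[[0, 2], [0, 1]]
print(subset)

# A single value can be read by label with loc or by position with iloc
value_by_label = df.loc[2, 'B']
value_by_position = df.iloc[2, 1]
print(value_by_label, value_by_position)
```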

Slicing data

Slicing in Pandas is straightforward for both Series and DataFrames. Slicing a Series selects a contiguous subset of elements by index label or by position, whereas slicing a DataFrame selects a subset of rows and/or columns by index label, position, or condition. Label-based slices include both endpoints, while position-based slices exclude the stop position, as in ordinary Python slicing.

Series

import pandas as pd

# Series
s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# Slicing Series by index label (both endpoints are included)
print(s['B':'D'])

# Slicing Series by position (the stop position is excluded)
print(s.iloc[1:4])

Output:
B    20
C    30
D    40
dtype: int64

B    20
C    30
D    40
dtype: int64
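
Positional slices accept the usual Python start:stop:step forms, including negative positions. A brief sketch, assuming the same five-element Series as in the other examples:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=['A', 'B', 'C', 'D', 'E'])

# Every second element, starting from the first
every_other = s.iloc[::2]
print(every_other)

# The last two elements
last_two = s.iloc[-2:]
print(last_two)
```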

DataFrame

import pandas as pd

# DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Slicing DataFrame by rows and columns
print(df.loc[0:1, 'A':'B'])  # Slice rows 0 to 1 and columns 'A' to 'B' (endpoints included)

Output:
   A  B
0  1  4
1  2  5
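
Label-based slices with loc include both endpoints, which is why rows 0 and 1 both appear above. Position-based slices with iloc follow ordinary Python slicing and exclude the stop position. A minimal sketch of the equivalent positional slice on the same DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Label-based slice: rows 0..1 and columns 'A'..'B', endpoints included
by_label = df.loc[0:1, 'A':'B']

# Position-based slice: the stop values (2 and 2) are excluded
by_position = df.iloc[0:2, 0:2]

print(by_label.equals(by_position))
```

The two forms agree here only because the row labels are the default integers; with other labels, loc and iloc slices need to be written differently.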

This is original educational material on basic data operations, created by aicorr.com.


by AICorr Team
