Data Operations
This is a data operations tutorial.
Within this tutorial, data operations refers to element-wise operations. Element-wise operations in Pandas are performed on the individual elements of a DataFrame or Series, for instance when adding or multiplying two dataframes. Operators play an important part in element-wise operations.
We cover some of the most common arithmetic and comparison methods. Let’s look at some examples of element-wise operations with scalar values.
Arithmetic operations
Arithmetic operators deal with processes such as addition, subtraction, multiplication, and division.
import pandas as pd

# Sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Addition
df_add = df + 7
print(df_add)

# Subtraction
df_subtract = df - 3
print(df_subtract)

# Multiplication
df_multiply = df * 3
print(df_multiply)

# Division
df_divide = df / 4
print(df_divide)

    A   B
0   8  11
1   9  12
2  10  13
   A  B
0 -2  1
1 -1  2
2  0  3
   A   B
0  3  12
1  6  15
2  9  18
      A     B
0  0.25  1.00
1  0.50  1.25
2  0.75  1.50
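Each arithmetic operator also has a method form in Pandas (add, sub, mul, div). The sketch below is a minimal illustration, reusing the sample dataframe from above, which suggests that the methods and the operators give the same results. The variable names are chosen for this example only.

```python
import pandas as pd

# Same sample DataFrame as in the operator example
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Method forms of the arithmetic operators
added = df.add(7)        # same as df + 7
subtracted = df.sub(3)   # same as df - 3
multiplied = df.mul(3)   # same as df * 3
divided = df.div(4)      # same as df / 4

# Compare each method result with its operator counterpart
print(added.equals(df + 7))       # True
print(subtracted.equals(df - 3))  # True
print(multiplied.equals(df * 3))  # True
print(divided.equals(df / 4))     # True
```

The method forms can be handy when chaining several calls, while the operators are usually shorter to read.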
Comparison operations
Comparison operators compare values. Each comparison returns Boolean output (i.e. True or False) for every element.
import pandas as pd

# Sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Greater than
df_gt = df > 2
print(df_gt)

# Less than
df_lt = df < 3
print(df_lt)

# Equality
df_eq = df == 2
print(df_eq)

       A     B
0  False  True
1  False  True
2   True  True
       A      B
0   True  False
1   True  False
2  False  False
       A      B
0  False  False
1   True  False
2  False  False
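One common use of the Boolean output is to select or count values. The following is a small sketch under that assumption, reusing the sample dataframe. The names mask, filtered, and count are illustrative and not part of the original examples.

```python
import pandas as pd

# Same sample DataFrame as in the comparison example
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Boolean Series: True where column A is at least 2
mask = df['A'] >= 2

# Keep only the rows where the mask is True
filtered = df[mask]
print(filtered)
#    A  B
# 1  2  5
# 2  3  6

# True counts as 1, so summing twice counts all values greater than 2
count = int((df > 2).sum().sum())
print(count)  # 4
```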
Element-wise operations
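In this section, we cover element-wise operations between two series or dataframes. These follow the same logic as operations with a scalar value, except that each element is paired with the element that has the same labels (row index and column name) in the other object. The operators work both on columns within one dataframe and on two separate dataframes.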
Same dataframe
import pandas as pd

# Sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Element-wise operation (addition)
result = df['A'] + df['B']
print(result)

0    5
1    7
2    9
dtype: int64
Separate dataframes
import pandas as pd

# Sample DataFrame
data_1 = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df_1 = pd.DataFrame(data_1)

# Sample DataFrame
data_2 = {'A': [3, 5, 7], 'B': [10, 15, 20]}
df_2 = pd.DataFrame(data_2)

# Element-wise operation (addition)
result = df_1 + df_2
print(result)

    A   B
0   4  14
1   7  20
2  10  26
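The examples above use objects that share the same index. As a hedged illustration of what happens when they do not, the sketch below adds two series with partly different labels. The series s1 and s2 and their labels are made up for this example.

```python
import pandas as pd

# Two Series whose index labels only partly overlap (illustrative data)
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Values are paired by label; labels missing on one side give NaN
total = s1 + s2
print(total)
# a     NaN
# b    12.0
# c    23.0
# d     NaN
# dtype: float64

# The add() method can treat a missing value as 0 instead
filled = s1.add(s2, fill_value=0)
print(filled)
# a     1.0
# b    12.0
# c    23.0
# d    30.0
# dtype: float64
```

Pairing by label rather than by position means the order of the rows does not affect the result.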
Applying functions
Pandas provides efficient ways of applying functions to data. There are two different methods of applying functions, “apply()” and “map()”.
Let’s explore both techniques separately.
Apply()
This method applies a function along an axis of a dataframe, so you can work column-wise (axis=0) or row-wise (axis=1). On a series, it applies the function to each value. We use the built-in “sum” function in this scenario.
import pandas as pd

# Sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Column-wise (sum function)
column_sum = df.apply(sum, axis=0)
print(column_sum)

# Row-wise (sum function)
row_sum = df.apply(sum, axis=1)
print(row_sum)

A     6
B    15
dtype: int64
0    5
1    7
2    9
dtype: int64
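The same pattern works with a function you write yourself. The sketch below assumes a hypothetical helper, value_range, that returns the difference between the largest and smallest value it receives. It is only an illustration of passing a custom function to apply().

```python
import pandas as pd

# Same sample DataFrame as in the sum example
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

def value_range(values):
    """Hypothetical helper: largest value minus smallest value."""
    return values.max() - values.min()

# Column-wise: the function receives each column as a Series
column_range = df.apply(value_range, axis=0)
print(column_range)
# A    2
# B    2
# dtype: int64

# Row-wise: the function receives each row as a Series
row_range = df.apply(value_range, axis=1)
print(row_range)
# 0    3
# 1    3
# 2    3
# dtype: int64
```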
Map()
This method applies a function to every item of a series. The function accepts and returns a scalar. We use the combination of map and lambda in this example.
import pandas as pd

# Sample Series
s = pd.Series([1, 2, 3, 4])

# Map & lambda functions
s_mapped = s.map(lambda x: x ** 2)
print(s_mapped)

0     1
1     4
2     9
3    16
dtype: int64
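Besides a function, map() on a series also accepts a dictionary that maps old values to new ones. The following minimal sketch uses a made-up dictionary, labels, to show this. Values without a matching key become NaN.

```python
import pandas as pd

# Same sample Series as in the lambda example
s = pd.Series([1, 2, 3, 4])

# Illustrative lookup table; 4 is left out on purpose
labels = {1: 'one', 2: 'two', 3: 'three'}

# Each value is replaced by its dictionary entry
s_labeled = s.map(labels)
print(s_labeled)
# 0      one
# 1      two
# 2    three
# 3      NaN
# dtype: object
```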
This is an original data operations educational material created by aicorr.com.