Random Data

Generating random numbers

This is a random data tutorial.

NumPy's random module (np.random) provides many functions for generating pseudo-random numbers.

We explore the generation of random floats and random integers.

import numpy as np

# Generate a random floating-point number in the interval [0, 1)
random_float = np.random.rand()
print(random_float)

# Generate an array of 10 random floating-point numbers in [0, 1)
random_floats = np.random.rand(10)
print(random_floats)

# Generate an array of random floating-point numbers between 0 and 1 of shape (10, 2)
random_float_array = np.random.rand(10, 2)
print(random_float_array)

# Generate a random integer from 0 to 9 (the upper bound 10 is excluded)
random_integer = np.random.randint(10)
print(random_integer)

# Generate an array of 10 random integers from 0 to 9
random_integers = np.random.randint(10, size=10)
print(random_integers)

# Generate an array of random integers between 0 and 9 of shape (3, 3)
random_array = np.random.randint(10, size=(3, 3))
print(random_array)
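
The integer examples above pass only an upper bound. As a small sketch, np.random.randint also accepts both a lower and an upper bound, where the lower bound is included and the upper bound is excluded. The die-roll scenario and the name die_rolls are illustrative assumptions, not part of the original examples.

```python
import numpy as np

# Simulate 1000 rolls of a six-sided die (illustrative scenario).
# The lower bound 1 is included and the upper bound 7 is excluded,
# so every value is one of 1, 2, 3, 4, 5, 6.
die_rolls = np.random.randint(1, 7, size=1000)

print(die_rolls[:10])                    # first ten rolls
print(die_rolls.min(), die_rolls.max())  # smallest and largest value drawn
```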

Seeding

In NumPy, np.random.seed sets the initial state of the global pseudo-random number generator. In other words, it makes the sequence of generated numbers predictable.

Seeding the random number generator ensures reproducibility: running the code with the same seed produces the same sequence of random numbers every time. Let’s explore the process with an example.

Generation without seeding.

import numpy as np

random_float = np.random.rand(3)
print(random_float)
# Example output (changes on every run): [0.01290456 0.03397538 0.55942502]

random_float = np.random.rand(3)
print(random_float)
# Example output (changes on every run): [0.91551117 0.3495945  0.24628688]

The outcomes are different.

Generation with seeding.

import numpy as np

# Set the seed value
np.random.seed(42)

random_float = np.random.rand(3)
print(random_float)
# Output: [0.37454012 0.95071431 0.73199394]

# Reset the seed to the same value
np.random.seed(42)

random_float = np.random.rand(3)
print(random_float)
# Output: [0.37454012 0.95071431 0.73199394]

The outcomes are the same.
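
np.random.seed resets NumPy's global generator, so it affects every later np.random call in the program. One possible alternative, sketched here, is to create separate np.random.RandomState instances that each carry their own seed. The names rng_a, rng_b, and rng_c are hypothetical.

```python
import numpy as np

# Two independent generators created with the same seed
rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

first = rng_a.rand(3)
second = rng_b.rand(3)

# Same seed, same sequence
print(np.array_equal(first, second))

# A generator with a different seed produces a different sequence
rng_c = np.random.RandomState(7)
third = rng_c.rand(3)
print(np.array_equal(first, third))
```

Because each instance keeps its own state, drawing from one generator does not change what the others produce.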

Random sampling

Random sampling refers to the process of selecting a subset of items or data points from a larger population. In simple random sampling, each item in the population has an equal chance of being selected; weighted sampling assigns each item its own probability instead. It is a fundamental concept in statistics and probability theory.

NumPy has several functions for random sampling. We explore np.random.sample, np.random.choice, and np.random.shuffle.

Let’s dive into the coding part.

Sample() – outputs random floats in the half-open interval [0, 1), meaning 0 is included and 1 is excluded.

import numpy as np

# outputs 1 random datapoint
random_sample = np.random.sample()
print(random_sample)
# Output: 0.7434698063973832

# outputs multiple random datapoints
random_sample = np.random.sample(3)
print(random_sample)
# Output: [0.89479096 0.09273952 0.6416487]
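
Values from np.random.sample always start at 0 and stay below 1. A common way to obtain values in another range is to scale the result by the width of the range and then shift it by the lower bound. The bounds a and b below are arbitrary example values.

```python
import numpy as np

a, b = 5.0, 10.0  # example bounds, chosen only for illustration

# Scale by the interval width (b - a), then shift by the lower bound a
scaled = (b - a) * np.random.sample(5) + a
print(scaled)  # five floats between 5 and 10
```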

Choice() – outputs random datapoints from a set of items.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# outputs 1 random datapoint
random_choice = np.random.choice(arr)
print(random_choice)
# Output: 2

# 'size' sets how many datapoints are drawn (with replacement by default)
random_choice = np.random.choice(arr, size=3)
print(random_choice)
# Output: [3 1 5]

# 'p' sets the probability of each item; the values must sum to 1
random_samples = np.random.choice(arr, size=10, p=[0.1, 0.2, 0.2, 0.4, 0.1])
print(random_samples)
# Output: [1 4 4 2 2 4 4 1 1 2]
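
By default, np.random.choice samples with replacement, so the same item can appear more than once in the result. Passing replace=False draws each item at most once, which matches the idea of selecting a subset from a population. The population array below is a made-up example.

```python
import numpy as np

population = np.arange(1, 11)  # example population: the integers 1 to 10

# Draw 4 distinct items; size must not exceed the population size
subset = np.random.choice(population, size=4, replace=False)
print(subset)

# Every selected item appears exactly once
print(len(np.unique(subset)) == len(subset))
```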

Shuffle() – randomly reorders the original set of items in place and returns None.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# the original datapoints are reordered in place; shuffle() returns None,
# so its result is not assigned to a variable
np.random.shuffle(arr)
print(arr)
# Output: [1 5 2 3 4]
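
np.random.shuffle changes the array it receives. When the original order must be kept, np.random.permutation returns a shuffled copy instead. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# permutation returns a new shuffled array; arr itself is left unchanged
shuffled_copy = np.random.permutation(arr)

print(arr)            # original order is preserved
print(shuffled_copy)  # same items in a random order
```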

This is an original random data educational material created by aicorr.com.

by AICorr Team
