
A Beginner’s Guide to Machine Learning: First Model with Python


Machine learning lets computers learn patterns from data and use them to make predictions or decisions. If you’re new to the field, fear not! In this tutorial, we’ll walk you through building your first machine learning model in Python: loading a dataset, training a classifier, and evaluating how well it performs.

Prerequisites:

Before we start, make sure you have the following installed on your machine:

  1. Python 3 (current releases of scikit-learn and pandas do not support Python 2)
  2. Jupyter Notebook (optional but recommended for an interactive learning experience)
  3. The scikit-learn and pandas libraries (Step 1 shows how to install them)

Step 1: Installing Required Libraries

This tutorial relies on two Python libraries: pandas for data manipulation and scikit-learn for machine learning. Open your terminal or command prompt and install both with pip by running the following command:

pip install scikit-learn pandas
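To confirm that the installation worked, a quick check like the one below can help. It is only a sketch: it prints whichever versions are installed, and the exact numbers will depend on your environment. Note that the package installed as scikit-learn is imported under the name sklearn.

```python
import pandas as pd
import sklearn

# Both packages expose their installed version as a string.
print("pandas:", pd.__version__)
print("scikit-learn:", sklearn.__version__)
```

If either import fails with a ModuleNotFoundError, the library was most likely installed into a different Python environment than the one you are running.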

Step 2: Importing Libraries

Fire up your favorite Python environment (Jupyter Notebook, Spyder, or a text editor) and start your script or notebook by importing what you need: pandas for data manipulation, and the scikit-learn tools for splitting the data, building the model, and evaluating it:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

Step 3: Loading the Dataset

Here, you use scikit-learn to load a built-in dataset known as the Iris dataset. It contains 150 iris flowers from three species (setosa, versicolor, and virginica), each described by four measurements: sepal length, sepal width, petal length, and petal width. You then wrap the data in a pandas DataFrame to make it easier to explore and manipulate.

Because the dataset ships with scikit-learn, you can load it directly:

from sklearn.datasets import load_iris

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target
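The target column stores each species as an integer label (0, 1, or 2). The sketch below shows one way to see which species each label stands for, using the target_names attribute of the loaded dataset. The label_to_species dictionary is a hypothetical helper introduced here for illustration; the rest of the tutorial does not use it.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

# iris.target_names lists the species; position i is the name for label i.
# label_to_species is a hypothetical helper, not part of the original tutorial.
label_to_species = {label: str(name) for label, name in enumerate(iris.target_names)}
print(label_to_species)

# Show the first rows with a readable species name next to the numeric label.
preview = data.head().assign(species=data['target'].map(label_to_species))
print(preview[['target', 'species']])
```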

Step 4: Exploring the Dataset

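Before training anything, take a peek at the dataset to understand its structure and content. head() shows the first five rows, while info() lists each column with its data type and number of non-null values:

print(data.head())
data.info()  # info() prints its summary directly and returns None, so it needs no print()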
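If you want to dig a little deeper, the following sketch shows a few more pandas calls that are often useful at this stage: summary statistics, a missing-value count, and the number of samples per class. It goes beyond the original steps and reloads the data so that it runs on its own.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Summary statistics (count, mean, std, min, quartiles, max) for each column.
print(data.describe())

# Number of missing values in each column.
missing = data.isnull().sum()
print(missing)

# Number of samples for each class label, ordered by label.
class_counts = data['target'].value_counts().sort_index()
print(class_counts)
```

The Iris dataset has no missing values and the same number of samples in every class, which is part of what makes it a comfortable first dataset.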

Step 5: Splitting the Data

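You split the dataset into two parts: a training set and a testing set. The training set is used to train the machine learning model, while the testing set is held back to evaluate how the model performs on data it has not seen. Here, test_size=0.2 reserves 20% of the rows for testing, and random_state=42 fixes the random shuffle so that you get the same split every time you run the code:

X = data.drop('target', axis=1)  # features: the four measurements
y = data['target']               # labels: the species of each flower
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)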
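It can be reassuring to check what the split actually produced. The sketch below prints the sizes of the two sets and also passes stratify, an optional train_test_split parameter that the tutorial’s own split does not use, to keep the class proportions the same in both sets. Treat it as an illustration of the option rather than a required change.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

X = data.drop('target', axis=1)
y = data['target']

# stratify=y is an addition for illustration: it preserves class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 150 rows split 80/20 gives 120 training rows and 30 test rows.
print(X_train.shape, X_test.shape)

# With stratification, each of the three classes appears equally often.
test_counts = y_test.value_counts().sort_index()
print(test_counts)
```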

Step 6: Building the Machine Learning Model

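Here, you choose a simple machine learning algorithm, logistic regression, to build your model. Despite its name, logistic regression is a classification algorithm, which makes it a good fit for predicting the species of a flower. You create an instance of the model and train it on the training data with fit(). The default limit of 100 solver iterations may not be enough on unscaled data, in which case scikit-learn emits a ConvergenceWarning, so the code below raises the limit:

model = LogisticRegression(max_iter=200)  # raise the iteration limit (default: 100) so the solver can converge
model.fit(X_train, y_train)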
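Once the model is trained, you can look at what it has learned. The sketch below repeats the earlier steps so that it runs on its own, then prints the class labels the model knows about and the shape of its coefficient matrix. The max_iter=200 setting is an assumption made here to give the solver room to converge; it is not a tuned value.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    data.drop('target', axis=1), data['target'], test_size=0.2, random_state=42
)

# max_iter=200 is an assumption for this sketch, not a tuned value.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# The class labels seen during training.
print(model.classes_)

# One row of coefficients per class, one column per feature.
print(model.coef_.shape)

# One intercept per class.
print(model.intercept_.shape)
```

With three species and four measurements, the coefficient matrix has three rows and four columns: each row holds the weights the model uses to score one species.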

Step 7: Making Predictions

The trained model is then used to predict the species of each flower in the test set, which it did not see during training. predict() returns one predicted class label per row of X_test:

predictions = model.predict(X_test)
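Besides hard class labels, a logistic regression model can also report how confident it is. The sketch below uses predict_proba, which returns one probability per class for every test sample, and checks that the most probable class matches what predict returns. It repeats the earlier steps so that it runs on its own, and max_iter=200 is again an assumption rather than a tuned value.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    data.drop('target', axis=1), data['target'], test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)  # assumption: not a tuned value
model.fit(X_train, y_train)

predictions = model.predict(X_test)

# One row per test sample, one column per class; each row sums to 1.
proba = model.predict_proba(X_test)
print(proba[:5].round(3))

# The predicted label is the class with the highest probability.
most_likely = model.classes_[np.argmax(proba, axis=1)]
print((most_likely == predictions).all())
```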

Step 8: Evaluating the Model

Finally, you evaluate the performance of your model by comparing its predictions to the actual target values in the test set. Two tools are used for this purpose: the accuracy score, which is the fraction of test samples classified correctly, and the classification report, which lists precision, recall, and F1-score for each class:

print(f'Accuracy: {accuracy_score(y_test, predictions)}')
print('\nClassification Report:\n', classification_report(y_test, predictions))
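A confusion matrix is another common way to look at the same results, because it shows which classes get mistaken for which. The sketch below is an addition to the tutorial’s two evaluation tools: it uses scikit-learn’s confusion_matrix function and repeats the earlier steps so that it runs on its own, with max_iter=200 as an assumed setting.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    data.drop('target', axis=1), data['target'], test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)  # assumption: not a tuned value
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_test, predictions)
print(cm)

# Correct predictions sit on the diagonal, so accuracy is the
# diagonal sum divided by the total number of test samples.
accuracy_from_cm = np.trace(cm) / cm.sum()
print(accuracy_from_cm, accuracy_score(y_test, predictions))
```

Any non-zero number off the diagonal is a misclassification: its row tells you the true species and its column tells you what the model predicted instead.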

Conclusion

Congratulations! You’ve just built and evaluated your first machine learning model. Along the way, you loaded a dataset, split it into training and testing sets, trained a model, and evaluated its performance. This is the basic workflow behind most machine learning projects. As you progress, you can explore more complex algorithms, fine-tune your models, and work on diverse datasets to expand your machine learning skills. Happy coding!
