Machine Learning, often shortened to ML, is a field of Computer Science. A Machine Learning system is a program or set of algorithms that analyses a given set of data and predicts future behaviour. Put another way, a Machine Learning program is a basic form of Artificial Intelligence that takes raw data, processes it, and predicts an output.
The main purpose of Machine Learning is to make a system learn from experience, in the form of data, without being explicitly programmed and without human involvement.
A Machine Learning system is built around a specific algorithm that constructs a model from sample data, also called "training data". This trained model is then used to make predictions or decisions without human intervention or hard-coded rules. Machines are just dumb objects that cannot work independently, so to make them act by themselves, the Machine Learning technique is used.
Why is Machine Learning Required?
As you all know, we Human Beings are among the most evolved and intelligent species on Earth: we can think, speak, evolve over time, and solve complex tasks.
Machines, on the other hand, are dumb objects. Using different algorithms, we can teach machines to perform tasks the way humans do. So to automate a process, we need to make the machine learn, and that is done with the help of the Machine Learning technique.
What is Python?
First of all, Python is a general-purpose, high-level programming language. It uses an interpreter that converts human-readable code into machine-understandable instructions. The language was first released in 1991; it was designed by Guido van Rossum and is now developed by the Python Software Foundation.
It is a dynamically-typed language that supports multiple paradigms, including structured, functional, and object-oriented programming. Python also has a rich ecosystem of libraries: reusable chunks of code that can be pulled into a program to save development time.
In this article, we will be using the NLTK library, which stands for Natural Language Toolkit.
Why Machine Learning with Python?
As you all know, Python is a human-friendly language, and many developers regard it as one of the best programming languages available. Code written in Python is compact and readable by anyone, while the logic behind machine learning algorithms is often complex.
Because of Python's readability and simplicity, developers can invest most of their time in solving the problem itself instead of wrestling with the technical design of the program.
Python also ships with more functionality out of the box than many other languages. It provides many frameworks, libraries, and extensions that reduce code length and make algorithms easier to implement, and its readable style makes it straightforward for several developers to collaborate on a single task.
Types of Learning Methods in Machine Learning with Python
There are various learning methods in Machine Learning, such as Supervised, Unsupervised, Semi-Supervised, and Reinforcement Learning. Let's look at each in detail.
- Supervised Learning:
This is the most widely used method in the Machine Learning process. The main job of this algorithm is to learn the relationship between sample input data and the corresponding output over multiple rounds of training.
The process is called supervised because the model learns under the supervision of labelled data.
It is further divided into two classes: Classification and Regression.
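For instance, a supervised classifier can be trained on labelled examples and then asked to label new input. Below is a tiny illustrative sketch using NLTK's NaiveBayesClassifier (the same classifier we will use later in this article); the word features and labels are made up purely for demonstration:

```python
from nltk.classify import NaiveBayesClassifier

# Hypothetical labelled training data: feature dicts paired with labels
train = [
    ({"great": True, "fun": True}, "Positive"),
    ({"awful": True, "boring": True}, "Negative"),
    ({"great": True}, "Positive"),
    ({"boring": True}, "Negative"),
]

# The classifier learns, under supervision, which words go with which label
classifier = NaiveBayesClassifier.train(train)

# Now it can label input it has never seen in this exact combination
print(classifier.classify({"fun": True}))  # "fun" only appeared in Positive examples
```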
- Unsupervised Learning:
As the name itself suggests, this is the opposite of supervised machine learning: there is no labelled output to guide the program. This type of algorithm is useful when no pre-labelled training data is available.
It is further divided into different classes based on the task:
- Clustering
- Dimensionality Reduction
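Clustering, for example, groups similar data points together without any labels at all. Here is a minimal sketch of the k-means idea in plain Python; the data values and the fixed iteration count are hypothetical choices for illustration, not a production implementation:

```python
def kmeans_1d(points, k, iterations=10):
    # Start with the first k points as the initial centroids
    centroids = points[:k]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Two obvious groups around 1.0 and 10.0 -- the algorithm finds them unaided
data = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6]
print(sorted(kmeans_1d(data, 2)))
```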
- Semi-Supervised Learning:
This type of learning falls under neither supervised nor unsupervised learning. A supervised model is first trained on a small amount of labelled data; that model is then used to label a large amount of unlabelled data. In this way, a large labelled dataset is obtained cheaply.
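This idea is often called "self-training", and it can be sketched in a few lines. The feature dicts below are hypothetical examples, again using NLTK's NaiveBayesClassifier for consistency with the rest of the article:

```python
from nltk.classify import NaiveBayesClassifier

# Small labelled set (hypothetical data)
labelled = [
    ({"great": True}, "Positive"),
    ({"awful": True}, "Negative"),
]
# Larger unlabelled set
unlabelled = [{"great": True, "fun": True}, {"awful": True, "dull": True}]

# 1. Train a supervised model on the small labelled set
model = NaiveBayesClassifier.train(labelled)

# 2. Pseudo-label the unlabelled examples with the model's own predictions
labelled += [(feats, model.classify(feats)) for feats in unlabelled]

# 3. Retrain on the enlarged, now fully labelled, dataset
model = NaiveBayesClassifier.train(labelled)
print(len(labelled))  # the training set has grown from 2 to 4 examples
```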
- Reinforcement Learning:
It is not similar to the methods we covered previously, and it is used less often. In this kind of algorithm, an agent is trained by us to interact with a specific environment: it uses certain strategies to observe the environment and then takes the appropriate action.
The reinforcement method follows these steps:
Step 1: Initialise an Agent with some set of strategies.
Step 2: Carefully observe the environment.
Step 3: Select an optimal policy for the current state of the environment.
Step 4: Reward the Agent for its decision-making.
Step 5: Adjust the strategies according to the outcome.
Step 6: Repeat steps 2-5 until the Agent learns to make optimal decisions.
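The steps above can be sketched as a toy "multi-armed bandit" loop: an agent repeatedly picks one of two actions, receives a reward from the environment, and updates its value estimates. All the numbers here (payoff probabilities, exploration rate, step count) are hypothetical, chosen only to illustrate the loop:

```python
import random

random.seed(0)  # make the illustration repeatable

true_reward = {"A": 0.2, "B": 0.8}   # hidden environment payoffs
estimates = {"A": 0.0, "B": 0.0}     # the agent's strategy (Step 1)
counts = {"A": 0, "B": 0}

for step in range(500):
    # Steps 2-3: observe and pick an action (mostly the best known one,
    # with 10% random exploration)
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(estimates, key=estimates.get)
    # Step 4: the environment rewards the decision
    reward = 1 if random.random() < true_reward[action] else 0
    # Step 5: adjust the strategy from the new experience
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

# Step 6 was the loop itself; the agent should now prefer "B"
print(max(estimates, key=estimates.get))
```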
Let’s Build Sentiment Analysis using Machine Learning with Python
Sentiment Analysis is a part of Natural Language Processing (NLP), and it is one of the most popular applications of NLP. Using this method, we can determine whether a given text is "Positive" or "Negative".
There is also a third possibility, "Neutral", though that is a rarer case. This method is mostly used to give machines the ability to analyse what a person thinks about a certain thing.
Let's begin by installing the required tools:
- Follow the given link for installation of Python on Windows: https://docs.python-guide.org/starting/install3/win/#install3-windows
- Open a terminal in your favourite code editor, such as VS Code, and use the following command to install NLTK: "pip install nltk"
- The program below also needs the movie reviews corpus, which you can fetch once with: python -c "import nltk; nltk.download('movie_reviews')"
We are all done, so let's move on to the programming portion. The program below is adapted from the book "Python Machine Learning Cookbook" by Prateek Joshi; it has been slightly modified to improve usability.
Step 1: Create a Python file with the name "Sentiment_Analysis.py" and import the libraries.
# Importing required libraries
import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews
Step 2: Create the function that will be used to extract the features.
def extract_features(word_list):
    return dict([(word, True) for word in word_list])
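To see what this feature extractor produces, you can call it on a small word list. It turns a list of words into a dict of word-to-True flags, which is the featureset format NLTK classifiers expect. (The function is repeated below so the snippet runs on its own; the example words are arbitrary.)

```python
def extract_features(word_list):
    # Map every word to True: "this word is present in the text"
    return dict([(word, True) for word in word_list])

print(extract_features(["good", "movie"]))
# {'good': True, 'movie': True}
```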
Step 3: Load the movie review data on which the classifier will be trained to give a positive or negative reply.
# Load positive and negative reviews
positive_fileids = movie_reviews.fileids('pos')
negative_fileids = movie_reviews.fileids('neg')
Step 4: Separating the reviews into two categories, "Positive" and "Negative".
features_positive = [(extract_features(movie_reviews.words(fileids=[f])), 'Positive') for f in positive_fileids]
features_negative = [(extract_features(movie_reviews.words(fileids=[f])), 'Negative') for f in negative_fileids]
Step 5: In this step, we split the dataset into two parts: the "training dataset" and the "testing dataset".
threshold_factor = 0.8
threshold_positive = int(threshold_factor * len(features_positive))
threshold_negative = int(threshold_factor * len(features_negative))
Step 6: Here we build the training and test sets from the extracted features.
features_train = features_positive[:threshold_positive] + features_negative[:threshold_negative]
features_test = features_positive[threshold_positive:] + features_negative[threshold_negative:]
print("\nNumber of training datapoints:", len(features_train))
print("Number of test datapoints:", len(features_test))
Step 7: In this step, we train a Naïve Bayes classifier on the training set and measure its accuracy on the test set.
classifier = NaiveBayesClassifier.train(features_train)
print("\nAccuracy of the classifier:", nltk.classify.util.accuracy(classifier, features_test))
Step 8: Create a while loop and take the user input.
while True:
    review = input("Enter Text: ")
Step 9: In this step, we run the classifier on the user's input and predict the sentiment.
    probdist = classifier.prob_classify(extract_features(review.split()))
    pred_sentiment = probdist.max()
Step 10: Finally, print the predicted sentiment and the probability of that prediction for the user's input.
    print("Predicted sentiment:", pred_sentiment)
    print("Probability:", round(probdist.prob(pred_sentiment), 2))
After writing the complete code, save the file and run it. It should give output something like this:
Number of training datapoints: 1600
Number of test datapoints: 400
Accuracy of the classifier: 0.735
After this, it will ask for input from the user; enter any text related to a movie.
Enter Text: The movie was boring
Predicted sentiment: Negative
Enter Text: The show was awesome
Predicted sentiment: Positive
This brings us to the end of the blog on machine learning with Python. We hope that you are now better acquainted with the concepts. Happy Learning!