Magenta Protocol

MLA Code


Last updated 1 year ago

Let’s get to the code. We have two choices: we can either import the linear regression model from the scikit-learn library and use it directly, or write our own regression model based on the equations above. Instead of choosing between the two, let’s do both :)

There are many datasets available online for linear regression. I used the one from this link. Let’s load the training and testing data.

import numpy as np
import pandas as pd

# Load the training and test splits (adjust the paths to your own copies).
df_train = pd.read_csv('/Users/{redacted}/Documents/Datasets/Linear_Regression/train.csv')
df_test = pd.read_csv('/Users/{redacted}/Documents/Datasets/Linear_Regression/test.csv')

# Independent variable (x) and dependent variable (y) as NumPy arrays.
x_train = np.array(df_train['x'])
y_train = np.array(df_train['y'])
x_test = np.array(df_test['x'])
y_test = np.array(df_test['y'])

# scikit-learn expects a 2-D feature matrix of shape (n_samples, 1).
x_train = x_train.reshape(-1, 1)
x_test = x_test.reshape(-1, 1)

We use the pandas library to read the train and test files. We retrieve the independent (x) and dependent (y) variables and, since we have only one feature (x), we reshape x into a column of shape (n_samples, 1) so that we can feed it into our linear regression model.
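To make the reshape step concrete, here is a minimal sketch on a toy array. The values are hypothetical and only stand in for one column of the CSV file; the point is the change of shape from a flat vector to a single-feature column.

```python
import numpy as np

# Hypothetical toy values standing in for the 'x' column of the dataset.
x = np.array([1.0, 2.0, 3.0, 4.0])

# scikit-learn estimators expect features as a 2-D array of shape
# (n_samples, n_features); -1 lets NumPy infer the number of rows.
x_2d = x.reshape(-1, 1)

print(x.shape)     # (4,)
print(x_2d.shape)  # (4, 1)
```

The target y can stay one-dimensional, which is why only x is reshaped.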

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# The `normalize` argument was removed in scikit-learn 1.2, so we use the defaults.
clf = LinearRegression()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print(r2_score(y_test, y_pred))

Now, let’s build our own linear regression model from the equations above. We will use only the NumPy library for the computations and the R2 score as the metric.

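## Linear Regression
import numpy as np

# Work with flat 1-D arrays so that the error has shape (n,), not (n, n).
x = np.ravel(x_train)
y_true = np.ravel(y_train)

n = len(x)        # number of training samples (700 in this dataset)
alpha = 0.0001    # learning rate

# Intercept and slope are scalars, both initialised to 0.0.
a_0 = 0.0
a_1 = 0.0

for epoch in range(1, 1001):
    y = a_0 + a_1 * x                      # current predictions
    error = y - y_true
    mean_sq_er = np.sum(error**2) / n      # cost (mean squared error)
    # Gradient descent step on the mean squared error.
    a_0 = a_0 - alpha * 2 * np.sum(error) / n
    a_1 = a_1 - alpha * 2 * np.sum(error * x) / n
    if epoch % 10 == 0:
        print(mean_sq_er)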

We initialize a_0 and a_1 to 0.0. For 1000 epochs we compute the predictions and the cost (the mean squared error), derive the gradients of the cost with respect to a_0 and a_1, and use those gradients to update both values. After 1000 epochs, a_0 and a_1 should be close to the values that minimize the cost, and hence we can formulate the best-fit straight line.
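The same update rule can be tried in isolation. The sketch below wraps it in a helper, `fit_line` (a name introduced here for illustration), and runs it on synthetic, noise-free data rather than on the dataset used in this tutorial. The learning rate and epoch count are also assumptions chosen for this toy range of x.

```python
import numpy as np

def fit_line(x, y, alpha=0.0001, epochs=1000):
    """Fit y ~ a_0 + a_1 * x by batch gradient descent on the MSE."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    n = x.size
    a_0, a_1 = 0.0, 0.0
    for _ in range(epochs):
        error = (a_0 + a_1 * x) - y
        # Partial derivatives of the mean squared error.
        a_0 -= alpha * 2 * np.sum(error) / n
        a_1 -= alpha * 2 * np.sum(error * x) / n
    return a_0, a_1

# Synthetic data generated from y = 2 + 3x (an assumption, not the real dataset).
x_demo = np.linspace(0, 10, 50)
y_demo = 3.0 * x_demo + 2.0

a_0, a_1 = fit_line(x_demo, y_demo, alpha=0.01, epochs=5000)
print(a_0, a_1)  # should approach the true intercept 2 and slope 3
```

Because the data are noise-free, the recovered coefficients give a quick check that the update signs and scaling are right before moving to real data.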

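import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score

# a_0 and a_1 are constants; reduce them to scalars so they broadcast over the test set.
a_0 = float(np.ravel(a_0)[0])
a_1 = float(np.ravel(a_1)[0])

y_prediction = a_0 + a_1 * x_test
print('R2 Score:', r2_score(y_test, y_prediction))

# Evaluate the fitted line at x = 0..99 for plotting.
x_plot = np.arange(100)
y_plot = a_0 + a_1 * x_plot
plt.figure(figsize=(10, 10))
plt.scatter(x_test, y_test, color='red', label='GT')
plt.plot(x_plot, y_plot, color='black', label='pred')
plt.legend()
plt.show()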

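The test set contains 300 samples while the training set contains 700, so a_0 and a_1 must not keep a 700x1 training shape when we predict. Since a_0 and a_1 are constants, it is enough to treat them as scalars and let NumPy broadcast them over the 300 test values. Now, we can just use the equation to predict values in the test set and obtain the R2 score.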
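The shape issue is easy to reproduce on a small scale. In this sketch the coefficients and array sizes are hypothetical (5 and 7 stand in for 300 and 700); it shows that scalar coefficients broadcast over any number of samples, while a leftover training-sized array does not.

```python
import numpy as np

a_0, a_1 = 2.0, 3.0                        # hypothetical fitted coefficients
x_small = np.arange(5.0).reshape(-1, 1)    # column vector, shape (5, 1)

# Scalars broadcast over every row, so the result keeps the shape of x_small.
y_hat = a_0 + a_1 * x_small
print(y_hat.shape)  # (5, 1)

# A coefficient array sized for a different sample count cannot be combined.
bad_a_0 = np.zeros((7, 1))
try:
    bad_a_0 + a_1 * x_small                # (7, 1) and (5, 1) do not broadcast
except ValueError as exc:
    print("shape mismatch:", exc)
```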

We can observe the same R2 score as with the previous method. We also plot the regression line along with the test data points to get a better visual understanding of how well our algorithm works.

In the first method, we use scikit-learn to import the linear regression model. We fit the model on the training data and predict the values for the testing data. We use the R2 score to measure the accuracy of our model.
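For readers who want to see what the R2 score measures, here is a minimal NumPy sketch of the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The helper name `r2_manual` and the toy values are assumptions for illustration; for a single target this is the same quantity that `r2_score` reports.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical toy targets.
y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2_manual(y_true_demo, y_true_demo)        # predictions equal the targets
baseline = r2_manual(y_true_demo, np.full(4, 2.5))   # always predict the mean
print(perfect, baseline)  # 1.0 0.0
```

A score of 1.0 means the predictions match the targets exactly, and 0.0 means the model does no better than always predicting the mean of y.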

[Figure: Regression Line - Test Data]