Linear Regression with PyTorch
There are two ways to fit a simple linear regression in PyTorch: use an optimizer to perform the parameter update, or write the ("vanilla") update by hand without an optimizer. Both approaches are illustrated below; the manual version is shown twice, with and without the legacy Variable wrapper.
# Using the optimizer update approach
import torch

learning_rate = 0.001
X = torch.tensor([1., 2., 3., 4., 5.])
y = torch.tensor([0., -1.1, -1.8, -3.1, -4.5])
W = torch.rand(1, requires_grad=True)
b = torch.rand(1, requires_grad=True)
optimizer = torch.optim.SGD([W, b], lr=learning_rate)
for i in range(1000):
    # Model prediction
    yhat = X * W + b
    # Loss function: sum of squared errors
    loss = (yhat - y).pow(2).sum()
    # Reset gradients, backpropagate, and take an optimizer step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print('i=', i, 'W=', W.detach().numpy(), 'b=', b.detach().numpy(), 'loss=', loss.detach().numpy())
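After training, the fitted W and b can be used for prediction. A minimal sketch, reusing the tensors from the loop above and a hypothetical new input X_new:

# Predict on new inputs with the fitted parameters (X_new is a hypothetical example input)
X_new = torch.tensor([6., 7.])
with torch.no_grad():
    y_new = X_new * W + b
print('predictions:', y_new.numpy())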
# Using the vanilla (manual) update method with the legacy Variable wrapper
import torch
from torch.autograd import Variable

learning_rate = 0.001
X = torch.tensor([1., 2., 3., 4., 5.])
y = torch.tensor([0., -1.1, -1.8, -3.1, -4.5])
W = Variable(torch.randn(1), requires_grad=True)
b = Variable(torch.randn(1), requires_grad=True)
for i in range(1000):
    yhat = X * W + b
    # Loss function: sum of squared errors
    loss = (yhat - y).pow(2).sum()
    loss.backward()
    # Manual SGD step: update the parameters in place, then reset the gradients
    W.data -= learning_rate * W.grad.data
    b.data -= learning_rate * b.grad.data
    W.grad.data.zero_()
    b.grad.data.zero_()
    print('i=', i, 'W=', W.detach().numpy(), 'b=', b.detach().numpy(), 'loss=', loss.detach().numpy())
# Without the Variable wrapper (Variable is deprecated; a tensor with requires_grad=True behaves the same)
import torch

learning_rate = 0.001
X = torch.tensor([1., 2., 3., 4., 5.])
y = torch.tensor([0., -1.1, -1.8, -3.1, -4.5])
W = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
for i in range(1000):
    yhat = X * W + b
    loss = (yhat - y).pow(2).sum()
    loss.backward()
    # Manual SGD step and gradient reset
    W.data -= learning_rate * W.grad.data
    b.data -= learning_rate * b.grad.data
    W.grad.data.zero_()
    b.grad.data.zero_()
    print('i=', i, 'W=', W.detach().numpy(), 'b=', b.detach().numpy(), 'loss=', loss.detach().numpy())
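Mutating .data still works, but in current PyTorch the same manual update is usually written inside a torch.no_grad() block, with grad.zero_() to reset the gradients. A sketch of the loop rewritten under that convention, reusing X, y, W, b, and learning_rate from above:

# Manual SGD step written with torch.no_grad() instead of .data
for i in range(1000):
    yhat = X * W + b
    loss = (yhat - y).pow(2).sum()
    loss.backward()
    with torch.no_grad():
        # Updates inside no_grad() are not tracked by autograd
        W -= learning_rate * W.grad
        b -= learning_rate * b.grad
        W.grad.zero_()
        b.grad.zero_()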
In PyTorch, the gradients must be set to zero before each backward pass because PyTorch accumulates gradients across backward passes (this accumulation is convenient when training RNNs, for example). If you do not zero the gradients at the start of each iteration of the training loop, the parameter update is based on the sum of old and new gradients, so the step can point in a direction other than the intended one towards the minimum.
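The accumulation is easy to see in isolation. The following sketch uses a hypothetical scalar parameter w and calls backward() twice on the same loss; the gradient doubles until it is explicitly zeroed:

# Gradient accumulation demo (w is a hypothetical parameter for illustration)
import torch
w = torch.tensor([2.0], requires_grad=True)
loss = (3 * w).sum()
loss.backward(retain_graph=True)   # keep the graph so backward() can be called again
print(w.grad)                      # tensor([3.])
loss.backward()                    # gradients accumulate: 3 + 3
print(w.grad)                      # tensor([6.])
w.grad.zero_()                     # reset before the next backward pass
print(w.grad)                      # tensor([0.])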