  • I ran into an interesting question during a job interview: given a set of data points on a plane, find the point that minimizes the sum of Euclidean distances to all of them. This point is known as the geometric median
import numpy as np
import torch
from matplotlib import pyplot as plt
  • Here I simulate the given data by sampling 100 random points on a 2D plane
points = np.random.rand(100, 2)
plt.scatter(points[:, 0], points[:, 1])
plt.show()

  • Because I want to visualize the work done by SGD, I use a PyTorch optimizer to update the parameter, which in this case is the candidate optimal point $ (w_x, w_y) $. Every step uses all 100 points, so strictly speaking this is full-batch gradient descent run through `torch.optim.SGD`
  • In the plot below, the red dot marks the initial value of the point to be optimized
param_w = np.array([[0.1, 0.1]])
plt.scatter(param_w[:, 0], param_w[:, 1], color='r')
plt.scatter(points[:, 0], points[:, 1])
plt.show()

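  • Intuitively, the loss is the mean distance from $ (w_x, w_y) $ to the $ n $ data points $ (x_i, y_i) $ (dividing the sum by $ n $ does not change the minimizer): $ loss = \frac{1}{n}\sum_{i=1}^{n}\sqrt{(x_i - w_x)^2 + (y_i-w_y)^2} $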
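  • For reference, here is a sketch of the gradient that `loss.backward()` computes for this loss. The shorthand $ p_i = (x_i, y_i) $ and $ w = (w_x, w_y) $ is my own notation, not part of the original question:

$$ \nabla_w\, loss = \frac{1}{n}\sum_{i=1}^{n}\frac{w - p_i}{\lVert w - p_i \rVert} $$

  • Each term is a unit vector pointing from a data point toward $ w $, so the gradient norm is at most 1, and a single step with learning rate 0.05 moves the point by at most 0.05. The gradient is undefined if $ w $ lands exactly on a data point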
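# Start from the same initial guess and wrap it as a trainable parameter
param_w = np.array([[0.1, 0.1]])
parameters = torch.nn.Parameter(torch.from_numpy(param_w))
data = torch.from_numpy(points)  # convert once, outside the loop
optimizer = torch.optim.SGD([parameters], lr=0.05)
updates = [parameters.detach().clone()]  # trajectory, starting at the initial point
for i in range(100):
    # Mean Euclidean distance from the current point to every data point
    loss = torch.mean(torch.linalg.norm(data - parameters, dim=1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    updates.append(parameters.detach().clone())
# Stack the (1, 2) snapshots into a (101, 2) array of visited positions
updated = torch.stack(updates).squeeze(1).numpy()
print(updated.shape)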
plt.scatter(updated[:, 0], updated[:, 1], color='r')
plt.scatter(points[:, 0], points[:, 1])
plt.show()
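
  • As a cross-check, the same update can be sketched by hand in NumPy, without any optimizer object. This is only a minimal sketch under my own assumptions: the helper names `mean_distance` and `mean_distance_grad`, the fixed seed, and the `demo_points` array are hypothetical stand-ins and not part of the code above

```python
import numpy as np

def mean_distance(pts, w):
    """Mean Euclidean distance from the point w to every row of pts."""
    return np.mean(np.linalg.norm(pts - w, axis=1))

def mean_distance_grad(pts, w, eps=1e-12):
    """Gradient of mean_distance with respect to w: the average unit
    vector pointing from each data point toward w."""
    diff = w - pts
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    # eps guards against division by zero if w lands exactly on a data point
    return np.mean(diff / np.maximum(dist, eps), axis=0)

rng = np.random.default_rng(0)        # fixed seed, only for reproducibility
demo_points = rng.random((100, 2))    # hypothetical stand-in for `points`
w = np.array([0.1, 0.1])              # same initial guess as above
start_loss = mean_distance(demo_points, w)
for _ in range(100):
    # Plain gradient descent step with the same learning rate, 0.05
    w = w - 0.05 * mean_distance_grad(demo_points, w)
end_loss = mean_distance(demo_points, w)
print(f"loss: {start_loss:.4f} -> {end_loss:.4f}, final point: {w}")
```

  • If the hand-written gradient is right, the loss should drop and the point should drift toward the middle of the cloud, much like the red trajectory drawn by the PyTorch version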

  • From the above I verified:
    • You don’t need a neural network model to see SGD at work
    • Using torch’s optimizer in this “non-traditional” way taught me a lot