Machine Learning in Finance

  • Support Vector Machines (SVM) → Developed as a classification method within the framework of statistical learning theory.
    • Later generalized to regression, both linear and nonlinear, where it is known as Support Vector Regression (SVR).
    • Training is a convex optimization problem; geometrically, the classifier is the hyperplane that separates the classes with the maximum margin.
  • Kernels let an SVM fit nonlinear decision boundaries: the kernel implicitly maps the inputs into a higher-dimensional feature space, where a linear separator is found.
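
As a rough illustration of the kernel point above, the sketch below fits two SVM classifiers to a synthetic dataset that no straight line can separate, one with a linear kernel and one with an RBF kernel, and then fits a Support Vector Regression to a noisy sine curve. The datasets, the parameter values (`C`, `gamma`, `epsilon`), and the variable names are assumptions chosen for illustration; none of them come from these notes.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC, SVR

# Synthetic data (illustrative assumption): two concentric rings,
# which cannot be separated by a straight line.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# Same model class, different kernels
linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

linear_acc = linear_svm.score(X, y)
rbf_acc = rbf_svm.score(X, y)
print(f"linear kernel, training accuracy: {linear_acc:.2f}")
print(f"RBF kernel, training accuracy:    {rbf_acc:.2f}")

# Support Vector Regression on a noisy sine curve (also synthetic)
rng = np.random.default_rng(0)
X_reg = np.sort(rng.uniform(0, 2 * np.pi, size=(200, 1)), axis=0)
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.1, size=200)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X_reg, y_reg)
svr_r2 = svr.score(X_reg, y_reg)  # R^2 on the training data
print(f"SVR R^2 on training data: {svr_r2:.2f}")
```

On data like this the RBF kernel should fit the rings far better than the linear kernel, since only the kernelized model can draw a closed boundary around the inner ring.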

In its simplest form, every scikit-learn model follows the same workflow, shown below: import the model class, instantiate it, fit it to the data, and predict on new data.

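# Template only: `module` and `Model` are placeholders for a real
# scikit-learn module and estimator class
from sklearn.module import Model
model = Model()                      # instantiate the model
model.fit(X, y)                      # learn from features X and labels y
predictions = model.predict(X_new)   # predict labels for new observations
print(predictions)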

Here is the same syntax used to train a classifier with the K-Nearest Neighbors (KNN) algorithm.

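import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# churn_df is assumed to be a pandas DataFrame that has already been loaded
# .values converts the selected columns to NumPy arrays
X = churn_df[["total_day_charge", "total_eve_charge"]].values
y = churn_df["churn"].values

# Classify each point by majority vote among its 15 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(X, y)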

X_new = np.array
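
A minimal end-to-end sketch of the steps above, ending with predictions for new observations. Since `churn_df` is not shown in these notes, the block builds synthetic NumPy arrays that stand in for its two feature columns. The values, the rule used to generate the churn labels, and the three rows of `X_new` are all made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in for the churn data (the real data is not shown)
rng = np.random.default_rng(42)
n = 200
total_day_charge = rng.uniform(10, 60, size=n)
total_eve_charge = rng.uniform(5, 30, size=n)

# Features: one row per customer, one column per feature
X = np.column_stack([total_day_charge, total_eve_charge])
# Made-up labelling rule: customers with a high day charge churn (1)
y = (total_day_charge > 40).astype(int)

knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(X, y)

# New, unlabeled observations with the same two columns as X
X_new = np.array([[55.0, 17.5],
                  [20.0, 24.0],
                  [15.0, 10.0]])
predictions = knn.predict(X_new)
print(predictions)  # one predicted label per row of X_new
```

`X_new` must have the same number of columns as the `X` used in `fit`, in the same order; `predict` then returns one label per row.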

Machine learning is commonly divided into a few types, most notably supervised and unsupervised learning. In supervised learning, a model is trained on “labeled” data, that is, observations whose target values are already known. There are two main types of supervised learning:

  • Classification → Target variable consists of categories
  • Regression → Target variable is continuous
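
To make the distinction concrete, the sketch below fits one model of each kind to the same made-up feature: a classifier whose target is a category (0 or 1) and a regressor whose target is a continuous number. The data and the choice of `LogisticRegression` and `LinearRegression` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))  # a single synthetic feature

# Classification: the target takes one of a fixed set of categories
y_class = (X.ravel() > 5).astype(int)
clf = LogisticRegression().fit(X, y_class)

# Regression: the target is a continuous value (here roughly 3x + 2)
y_cont = 3.0 * X.ravel() + 2.0 + rng.normal(scale=0.5, size=100)
reg = LinearRegression().fit(X, y_cont)

X_new = np.array([[2.0], [8.0]])
class_pred = clf.predict(X_new)  # category labels
reg_pred = reg.predict(X_new)    # real-valued estimates
print(class_pred)
print(reg_pred)
```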

Classifying labels of unseen data

  • Build a model
  • Model learns from the labeled data we pass to it
  • Pass unlabeled data to the model as input
  • Model predicts the labels of the unseen data
  • Labeled data = training data
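
The workflow above can be sketched as follows. The notes do not say where the unseen data comes from; holding out part of the labeled data with `train_test_split` is one common choice and is an assumption here, as are the synthetic dataset and the parameter values.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic labeled data (illustrative only)
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0,
    random_state=0,
)

# Hold out 20% of the rows to play the role of unseen data
X_train, X_unseen, y_train, y_unseen = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = KNeighborsClassifier(n_neighbors=15)  # build a model
model.fit(X_train, y_train)                   # learn from the labeled data
predicted_labels = model.predict(X_unseen)    # predict labels for unseen inputs
print(predicted_labels[:10])
```

Only `X_unseen` is passed to `predict`; the held-out labels `y_unseen` are kept aside so the predictions can be checked afterwards.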

Accuracy = # of correct predictions divided by total observations
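
As a small worked example, with hand-written labels that are made up for illustration, accuracy can be computed directly from the definition above and checked against scikit-learn's `accuracy_score`:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Made-up true labels and predictions for 8 observations
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Definition: correct predictions divided by total observations
correct = int(np.sum(y_true == y_pred))
manual_accuracy = correct / len(y_true)

# The same quantity from scikit-learn
library_accuracy = accuracy_score(y_true, y_pred)

print(correct, manual_accuracy, library_accuracy)
```

Here 6 of the 8 predictions match the true labels, so the accuracy is 6 / 8 = 0.75.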