AWS SageMaker Algorithms
Linear Learner
Linear Learner is a supervised learning algorithm that fits a linear model to the training data.
It can be used for both classification and regression tasks:
- Regression: output contains continuous numeric values
- Binary classification: output label must be either 0 or 1 (linear threshold function is used)
- Multiclass classification: output labels must be from 0 to num_classes - 1
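The label conventions above can be checked before a training job is launched. The following is a minimal sketch, not part of SageMaker: `check_labels` and `threshold_label` are hypothetical helper names, and the 0.5 cutoff is an assumed value for illustration only (the notes above only say that a linear threshold function is used).

```python
def check_labels(labels, predictor_type, num_classes=None):
    """Return True if the labels follow the convention for the given task."""
    if predictor_type == "regressor":
        # Regression targets are continuous numeric values.
        return all(isinstance(y, (int, float)) for y in labels)
    if predictor_type == "binary_classifier":
        # Binary labels must be exactly 0 or 1.
        return all(y in (0, 1) for y in labels)
    if predictor_type == "multiclass_classifier":
        if num_classes is None:
            raise ValueError("num_classes is required for multiclass labels")
        # Multiclass labels must be whole numbers in [0, num_classes - 1].
        return all(float(y).is_integer() and 0 <= y < num_classes for y in labels)
    raise ValueError(f"unknown predictor_type: {predictor_type}")


def threshold_label(score, threshold=0.5):
    """Turn a continuous score into a 0/1 label (assumed cutoff of 0.5)."""
    return 1 if score > threshold else 0
```

For example, `check_labels([0, 1, 2], "multiclass_classifier", num_classes=3)` passes, while the same labels fail the binary check because 2 is not a valid binary label.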
The best model optimizes either of the following:
- Continuous objectives, such as mean squared error, cross-entropy loss, or absolute error
- Discrete objectives suited to classification, such as F1 score, precision, recall, or accuracy
Pre-processing
- Ensure that data is shuffled before training
- Linear Learner offers built-in normalization (feature scaling), which is a plus
- Normalization is a critical pre-processing step: it keeps features with large numeric ranges from dominating the model weights
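To make the two pre-processing steps concrete, here is a rough sketch of shuffling and standardizing a small dataset with the Python standard library. It illustrates the idea only and is not how Linear Learner normalizes data internally; `shuffle_rows` and `standardize_columns` are hypothetical helper names, and zero-mean, unit-variance scaling is one assumed choice of normalization.

```python
import random
import statistics


def shuffle_rows(rows, seed=0):
    """Return a shuffled copy of the rows (the seed is fixed for repeatability)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled


def standardize_columns(features):
    """Scale each feature column to zero mean and unit standard deviation."""
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has a standard deviation of 0; divide by 1.0 instead.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in features
    ]


features = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize_columns(shuffle_rows(features))
```

After scaling, both columns span the same range, so the second feature no longer outweighs the first simply because its raw values are 100 times larger.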
Training
- Linear Learner trains with a distributed implementation of stochastic gradient descent (SGD)
- Select an appropriate optimization algorithm, such as Adam, AdaGrad, or SGD
- Hyperparameters, such as the learning rate, can be selected
- Reduce overfitting with L1 or L2 regularization
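The training ideas above (SGD updates, a learning rate, and L1/L2 penalties) can be sketched in a few lines of plain Python. This is a toy, single-machine illustration for a squared-error regression, not SageMaker's implementation; the function name `sgd_linear_regression` and the parameter names `l1` and `l2` are assumptions made for this example.

```python
import random


def sgd_linear_regression(X, y, learning_rate=0.05, l1=0.0, l2=0.0,
                          epochs=500, seed=0):
    """Fit y ~ w.x + b with per-sample SGD on squared error.

    l1 and l2 are penalty strengths on the weights (the bias is not penalized).
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a new random order each epoch
        for i in order:
            prediction = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            error = prediction - y[i]
            for j in range(n_features):
                sign = (w[j] > 0) - (w[j] < 0)  # subgradient of |w_j|
                gradient = error * X[i][j] + l2 * w[j] + l1 * sign
                w[j] -= learning_rate * gradient
            b -= learning_rate * error
    return w, b


# Toy data generated from y = 2x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
weights, bias = sgd_linear_regression(X, y)
shrunk_weights, _ = sgd_linear_regression(X, y, l2=1.0)
```

Without a penalty the fit should land close to a weight of 2 and a bias of 1; with the L2 penalty the weight is pulled toward zero, which is the mechanism that limits overfitting on noisier data.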
Validation
Trained models are evaluated against a validation dataset, and the best model is selected based on the following metrics:
- For regression: mean squared error, root mean squared error, or absolute error.
- For classification: cross-entropy loss, F1 score, precision, recall, or accuracy.
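For reference, several of the validation metrics named above can be computed by hand as follows. This is a small sketch using the standard textbook definitions for the regression case and for binary 0/1 labels; `regression_metrics` and `classification_metrics` are hypothetical helper names, not SageMaker functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Mean squared error, root mean squared error, and mean absolute error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Regression metrics are smooth functions of the prediction error, while the classification metrics depend only on counts of correct and incorrect labels, which is why they change in steps as the decision threshold moves.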
Linear Learner Input/Output data
Amazon SageMaker linear learner supports the following input formats for training:
- RecordIO-wrapped protobuf (only Float32 tensors are supported)
- Text/CSV (the first column is assumed to be the target label)
Both File mode and Pipe mode are supported for either format.
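As an illustration of the CSV convention, the sketch below writes rows with the target label in the first column, followed by the features. `to_training_csv` is a hypothetical helper name, and omitting the header row is an assumption made here, so check the current SageMaker documentation before relying on it.

```python
import csv
import io


def to_training_csv(features, labels):
    """Serialize rows as CSV text with the label in the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, row in zip(labels, features):
        # Label first, then the feature values; no header row is written.
        writer.writerow([label, *row])
    return buffer.getvalue()


csv_text = to_training_csv([[0.5, 1.5], [2.0, 3.0]], [1, 0])
```

The resulting text can then be uploaded to the storage location that the training job reads from.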
For inference, the linear learner algorithm supports the application/json, application/x-recordio-protobuf, and text/csv formats.
For regression (predictor_type='regressor'), the score is the prediction produced by the model.
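Reading the score out of a JSON response could look like the sketch below. The response shape used here (a top-level `predictions` list whose items each carry a `score` field) is an assumption for illustration, and `extract_scores` is a hypothetical helper; verify the exact layout against the output of your own endpoint.

```python
import json


def extract_scores(response_body):
    """Pull the score of every prediction out of a JSON response string."""
    payload = json.loads(response_body)
    # Assumed layout: {"predictions": [{"score": ...}, ...]}
    return [item["score"] for item in payload["predictions"]]


example_body = '{"predictions": [{"score": 1.5}, {"score": -0.25}]}'
scores = extract_scores(example_body)
```

For a regressor, each extracted score is the model's predicted value for the corresponding input row.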