ml_algo 7.2.0


Machine learning algorithms with Dart

What is ml_algo for?

The main purpose of the library is to give developers who are interested in both the Dart language and data science a native Dart implementation of machine learning algorithms. The library targets the Dart VM, so for the smoothest experience, please do not use it in a browser.

The following algorithms are implemented:

  • Linear regression:

    • Gradient descent algorithm (batch, mini-batch, stochastic) with ridge regularization
    • Lasso regression (feature selection model)
  • Linear classifier:

    • Logistic regression (with "one-vs-all" multiclass classification)
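
To make the first item of the list more concrete, here is a minimal sketch of batch gradient descent with ridge (L2) regularization for linear regression. It uses plain Dart lists rather than the library's classes. The names dot and fitRidge and all of their parameters are hypothetical, chosen only for this illustration, and the library's own implementation may differ.

```dart
// Illustrative sketch only: plain Dart lists, not the ml_algo API.
// All names here (dot, fitRidge, lambda, ...) are assumptions.

/// Dot product of two equally sized vectors.
double dot(List<double> a, List<double> b) {
  var sum = 0.0;
  for (var i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

/// Minimizes (1/n) * sum((x·w - y)^2) + lambda * |w|^2 with batch
/// gradient descent and a constant learning rate.
List<double> fitRidge(
  List<List<double>> features,
  List<double> labels, {
  int iterations = 1000,
  double learningRate = 0.1,
  double lambda = 0.0,
}) {
  final n = features.length;
  final m = features.first.length;
  final weights = List<double>.filled(m, 0.0);
  for (var iteration = 0; iteration < iterations; iteration++) {
    // Gradient of the mean squared error over the whole batch.
    final gradient = List<double>.filled(m, 0.0);
    for (var i = 0; i < n; i++) {
      final residual = dot(features[i], weights) - labels[i];
      for (var j = 0; j < m; j++) {
        gradient[j] += 2.0 * residual * features[i][j] / n;
      }
    }
    // The ridge penalty adds 2 * lambda * w to the gradient.
    for (var j = 0; j < m; j++) {
      weights[j] -= learningRate * (gradient[j] + 2.0 * lambda * weights[j]);
    }
  }
  return weights;
}
```

For instance, with the rows [1, 0], [0, 1], [1, 1] and the labels 2, 3, 5, the unregularized fit approaches the weights [2, 3], while a positive lambda shrinks them toward zero.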

The library's structure

To cover the main tasks of machine learning, the library exposes the following classes:

Usage

Real life example

Let's classify records from a well-known dataset, the Pima Indians Diabetes Database, using a logistic regressor

Import all necessary packages:

import 'dart:async';

import 'package:ml_algo/ml_algo.dart';

Read the CSV file pima_indians_diabetes_database.csv with the test data. You can use the file from the library's datasets directory:

final data = MLData.fromCsvFile('datasets/pima_indians_diabetes_database.csv');
final features = await data.features;
final labels = await data.labels;

The data in this file consists of 768 records with 8 features each. The processed features are stored in a data structure of type MLMatrix, and the processed labels are stored in a data structure of type MLVector. For more information about these types, please visit the ml_linalg repo
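
To illustrate what splitting a CSV file into features and labels involves, here is a minimal sketch in plain Dart. It is not how MLData is implemented: the Dataset class, the parseCsv function, and the labelIndex and hasHeader parameters are hypothetical, and the defaults assume that the label is stored in the ninth column, after the 8 features, and that the file starts with a header row.

```dart
// Illustrative sketch only, not the library's CSV reader.
// Dataset, parseCsv, labelIndex and hasHeader are assumptions.

/// A table of feature rows together with one label per row.
class Dataset {
  Dataset(this.features, this.labels);

  final List<List<double>> features;
  final List<double> labels;
}

/// Parses comma-separated numeric text and moves the column at
/// `labelIndex` out of every row into a separate list of labels.
Dataset parseCsv(String csv, {int labelIndex = 8, bool hasHeader = true}) {
  final lines = csv
      .split('\n')
      .map((line) => line.trim())
      .where((line) => line.isNotEmpty)
      .toList();
  final rows = hasHeader ? lines.skip(1) : lines;
  final features = <List<double>>[];
  final labels = <double>[];
  for (final line in rows) {
    final values =
        line.split(',').map((value) => double.parse(value.trim())).toList();
    // removeAt returns the label and leaves only the features in the row.
    labels.add(values.removeAt(labelIndex));
    features.add(values);
  }
  return Dataset(features, labels);
}
```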

Then we should create an instance of the CrossValidator class to fit the hyperparameters of our model:

final validator = CrossValidator.kFold(numberOfFolds: 7);
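
As a reminder of what k-fold cross-validation does with the records, here is a minimal sketch that splits record indices into folds. It uses plain Dart and does not reflect how CrossValidator works internally (the library may, for example, split the records differently); the Fold class and the kFoldIndices function are hypothetical names.

```dart
// Illustrative sketch only, not the library's CrossValidator.
// Fold and kFoldIndices are hypothetical names.

/// Indices of the records used for training and for testing in one fold.
class Fold {
  Fold(this.train, this.test);

  final List<int> train;
  final List<int> test;
}

/// Splits the indices 0..n-1 into `numberOfFolds` contiguous test folds.
/// Every record lands in exactly one test fold.
List<Fold> kFoldIndices(int n, int numberOfFolds) {
  if (numberOfFolds < 2 || numberOfFolds > n) {
    throw ArgumentError('numberOfFolds must be between 2 and n');
  }
  final folds = <Fold>[];
  final baseSize = n ~/ numberOfFolds;
  final remainder = n % numberOfFolds;
  var start = 0;
  for (var fold = 0; fold < numberOfFolds; fold++) {
    // The first `remainder` folds get one extra record.
    final size = baseSize + (fold < remainder ? 1 : 0);
    final end = start + size;
    final test = List<int>.generate(size, (i) => start + i);
    final train = List<int>.generate(n, (i) => i)
        .where((i) => i < start || i >= end)
        .toList();
    folds.add(Fold(train, test));
    start = end;
  }
  return folds;
}
```

With 768 records and 7 folds, this scheme yields five test folds of 110 records and two of 109; the model is trained on the remaining records of each fold and scored on the test part.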

Everything is set, so we can perform the classification. To fit the hyperparameter better, let's create a loop that tries each value of the chosen hyperparameter in a defined range:

final step = 0.001;
final limit = 0.6;
double minError = double.infinity;
double bestLearningRate = 0.0;
for (double rate = step; rate < limit; rate += step) {
  // ...
}
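
The loop above is a one-dimensional grid search. Here is a minimal, library-independent sketch of the same idea; rateGrid and bestCandidate are hypothetical helper names. The grid is built by multiplying the step by an integer index instead of adding the step repeatedly, which is one way to keep floating-point rounding from accumulating over hundreds of iterations.

```dart
// Illustrative sketch only; rateGrid and bestCandidate are hypothetical.

/// Builds the values step, 2 * step, 3 * step, ... strictly below `limit`.
List<double> rateGrid(double step, double limit) {
  final values = <double>[];
  for (var i = 1; i * step < limit; i++) {
    values.add(i * step);
  }
  return values;
}

/// Returns the candidate for which `evaluate` yields the lowest error,
/// or NaN if there are no candidates.
double bestCandidate(
  Iterable<double> candidates,
  double Function(double candidate) evaluate,
) {
  var best = double.nan;
  var bestError = double.infinity;
  for (final candidate in candidates) {
    final error = evaluate(candidate);
    if (error < bestError) {
      bestError = error;
      best = candidate;
    }
  }
  return best;
}
```

In the example of this section, evaluate would wrap the creation of the classifier and the call to the validator for a given learning rate.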

Let's create a logistic regression classifier instance with a stochastic gradient descent optimizer in the loop's body:

final logisticRegressor = LinearClassifier.logisticRegressor(
        iterationsLimit: 100,
        initialLearningRate: rate,
        learningRateType: LearningRateType.constant);
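
For readers who want to see the idea behind such a classifier, here is a minimal sketch of the logistic (sigmoid) function and of a single stochastic gradient step for binary logistic regression. It is written in plain Dart and is not the library's implementation; sigmoid and sgdStep are hypothetical names, and labels are assumed to be 0.0 or 1.0.

```dart
// Illustrative sketch only, not the library's optimizer.
// sigmoid and sgdStep are hypothetical names; labels are 0.0 or 1.0.
import 'dart:math' as math;

/// Maps a raw score to a probability in the range (0, 1).
double sigmoid(double score) => 1.0 / (1.0 + math.exp(-score));

/// Updates `weights` in place using one record: moves along the gradient
/// of the log-likelihood, which is (label - probability) * x.
void sgdStep(
  List<double> weights,
  List<double> x,
  double label,
  double learningRate,
) {
  var score = 0.0;
  for (var j = 0; j < x.length; j++) {
    score += weights[j] * x[j];
  }
  final error = label - sigmoid(score);
  for (var j = 0; j < x.length; j++) {
    weights[j] += learningRate * error * x[j];
  }
}
```

Starting from zero weights, the predicted probability is 0.5, so a record with label 1.0 pushes the weights in the direction of its features by learningRate * 0.5.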

Evaluate the model using the accuracy metric:

final error = validator.evaluate(logisticRegressor, features, labels, MetricType.accuracy);
if (error < minError) {
  minError = error;
  bestLearningRate = rate;
}
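
As a reminder of what an accuracy-style metric measures, here is a minimal sketch in plain Dart. The functions accuracy and errorRate are hypothetical and are not the library's code; how the value returned by validator.evaluate relates to them is up to the library.

```dart
// Illustrative sketch only; accuracy and errorRate are hypothetical names.

/// Share of predictions that match the actual labels, from 0.0 to 1.0.
double accuracy(List<double> predicted, List<double> actual) {
  if (predicted.length != actual.length || actual.isEmpty) {
    throw ArgumentError('Both lists must be non-empty and of equal length');
  }
  var correct = 0;
  for (var i = 0; i < actual.length; i++) {
    if (predicted[i] == actual[i]) {
      correct++;
    }
  }
  return correct / actual.length;
}

/// Share of predictions that do not match the actual labels.
double errorRate(List<double> predicted, List<double> actual) =>
    1.0 - accuracy(predicted, actual);
```

For the predictions [1, 0, 1, 1] and the actual labels [1, 0, 0, 1], three of four records match, so the accuracy is 0.75 and the error rate is 0.25.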

Let's print the score:

print('best error on classification: ${(minError * 100).toStringAsFixed(2)}%');
print('best learning rate: ${bestLearningRate.toStringAsFixed(3)}');

The search for the best model parameters currently takes a long time, so be patient. When the search is over, we will see something like this:

best error on classification: 35.5%
best learning rate: 0.155

All the code above put together:

import 'dart:async';

import 'package:ml_algo/ml_algo.dart';

Future<void> logisticRegression() async {
  // Read the dataset and wait for the processed features and labels
  final data = MLData.fromCsvFile('datasets/pima_indians_diabetes_database.csv');
  final features = await data.features;
  final labels = await data.labels;

  final validator = CrossValidator.kFold(numberOfFolds: 7);

  final step = 0.001;
  final limit = 0.6;

  double minError = double.infinity;
  double bestLearningRate = 0.0;

  // Try every learning rate in the range and keep the best one
  for (double rate = step; rate < limit; rate += step) {
    final logisticRegressor = LinearClassifier.logisticRegressor(
      iterationsLimit: 100,
      initialLearningRate: rate,
      learningRateType: LearningRateType.constant);
    final error = validator.evaluate(logisticRegressor, features, labels, MetricType.accuracy);
    if (error < minError) {
      minError = error;
      bestLearningRate = rate;
    }
  }

  print('best error on classification: ${(minError * 100).toStringAsFixed(2)}%');
  print('best learning rate: ${bestLearningRate.toStringAsFixed(3)}');
}

For more examples, please see the examples folder

Contacts

If you have questions, feel free to write me on

Changelog

7.2.0

  • SoftmaxMapper added (aka Softmax activation function)

7.1.0

  • ConvergenceDetector added (this entity stops the optimizer when it is needed)

7.0.0

  • All the exports packed into ml_algo entry

6.2.0

  • Coefficients in optimizers now are a matrix
  • InitialWeightsGenerator instantiating fixed: dtype is passed now

6.1.0

  • LinkFunction renamed to ScoreToProbMapper
  • ScoreToProbMapper accepts vector and returns vector instead of a scalar

6.0.6

  • Pedantic package integration added
  • Some linter issues fixed

6.0.5

  • Coveralls integration added
  • dartfmt check task added

6.0.4

  • Documentation for linear regression corrected
  • Documentation for MLData corrected

6.0.3

  • Documentation for logistic regression corrected

6.0.2

  • Tests corrected: removed import test_api.dart

6.0.1

  • Readme corrected

6.0.0

  • Library fully refactored:
    • add possibility to set certain data type for numeric computations
    • all algorithms code now is more generic
    • a lot of unit tests added
    • bug fixes

5.2.0

  • Ordinal encoder added
  • Float32x4CsvMlData significantly extended

5.1.0

  • Real-life example added (black friday dataset)
  • rows parameter added to Float32x4CsvMlData
  • Unknown categorical values handling strategy types added

5.0.0

  • One hot encoder integrated into CSV ML data

4.3.3

  • Performance test for one hot encoder added

4.3.2

  • One hot encoder implemented

4.3.1

  • enum for categorical data encoding added

4.3.0

  • Cross validator factory added
  • README updated

4.2.0

  • csv-parser added

4.1.0

  • ml_linalg removed from export file
  • README refreshed
  • General datasets directory created

4.0.0

  • ml_linalg ^4.0.0 supported

3.5.4

  • README.md updated
  • build_runner dependency updated

3.5.3

  • dartfmt tool applied to all necessary files

3.5.2

  • Travis configuration file name corrected

3.5.1

  • Travis integration added

3.5.0

  • Vectorized cost functions applied

3.4.0

  • ml_linalg 2.0.0 supported

3.3.0

  • Matrix-based gradient calculation added for log likelihood cost function

3.2.0

  • Matrix-based gradient calculation added for squared cost function

3.1.2

  • Description corrected

3.1.1

  • dartfmt tool applied

3.1.0

  • Get rid of MLVector's deprecated methods

3.0.0

  • Library public release

2.0.0

  • ml_linalg supported

1.2.1

  • subVector -> subvector

1.2.0

  • Matrices support added

1.1.1

  • Examples fixed, dependencies fixed

1.1.0

  • Support of updated linalg package

1.0.1

  • Readme updated, dependencies fixed

1.0.0

  • Migration to dart 2.0

0.38.1

0.38.0

  • Lasso solution refactored

0.37.0

  • Support of linalg package (former simd_vector)

0.36.0

  • Intercept term considered (fitIntercept and interceptScale parameters)

0.35.1

  • Logistic regression tests improved

0.35.0

  • One versus all refactored, tests for logistic regression added

0.34.0

  • One versus all classifier

0.33.0

  • Gradient descent regressor type enum added

0.32.1

  • Gradient optimizer unit tests

0.32.0

  • Get rid of derivative computation

0.31.0

  • Get rid of di package usage

0.30.1

  • File structure flattened

0.30.0

  • Redundant gradient optimizers removed

0.29.0

  • part ... part of directives removed

0.28.0

  • Coordinate descent optimizer added
  • Lasso regressor added

0.27.0

  • Gradient calculation changed

0.26.1

  • Code was optimized (unnecessary code removed)
  • Refactoring

0.26.0

  • More distinct modularity was added to the library
  • Unit tests were fixed

0.25.0

  • Tests for gradient optimizers were added
  • Gradient calculator was created as a separate entity
  • Initial weights generator was created as a separate entity
  • Learning rate generator was created as a separate entity

0.24.0

  • All implementations were hidden

0.23.0

  • findMaxima and findMinima methods were added to Optimizer interface

0.22.0

  • File structure reorganized, predictor classes refactored
  • README.md updated

0.21.0

  • Logistic regression model added (with example)

0.20.2

  • README.md updated

0.20.1

  • simd_vector dependency url fixed

0.20.0

  • Repository dependency corrected (dart_vector -> simd_vector)

0.19.0

  • Support for Float32x4Vector class was added (from dart_vector library)
  • Type List for label (target) list replaced with Float32List (in Predictor.train() and Optimizer.optimize())

0.18.0

  • class Vector and enum Norm were extracted to separate library (https://github.com/gyrdym/dart_vector.git)

0.17.0

  • Common interface for loss function was added
  • Derivative calculation was fixed (common canonical method was used)
  • Squared loss function was added as a separate class

0.16.0

  • README.md was updated

0.15.0

  • Tests for gradient optimizers were added
  • Interfaces (almost for all entities) for DI and IOC mechanism were added
  • Randomizer class was added
  • Removed separate classes for k-fold cross validation and lpo cross validation, now it resides in CrossValidation class

0.14.0

  • L1 and L2 regularization added

0.13.0

  • Script for running all unit tests added

0.12.0

  • Vector interface removed
  • Regular vector implementation removed
  • TypedVector -> Vector
  • Implicit vectors constructing replaced with explicit new-instantiation

0.11.0

  • Entity names correction

0.10.0

  • K-fold cross validation added (KFoldCrossValidation)
  • Leave P out cross validation added (LpoCrossValidation)
  • DataTrainTestSplitter was removed

0.9.0

  • copy, fill methods were added to Vector

0.8.0

  • Reflection was removed for all cases (Vector instantiation, Optimizer instantiation)

0.7.0

  • Abstract Vector-class was added as a base for typed and regular vector classes

0.6.0

  • Manhattan norm support was added

0.5.2

  • README file was extended and clarified

0.5.1

  • Random interval obtaining for the mini-batch gradient descent was fixed

0.5.0

  • BGDOptimizer, MBGDOptimizer and GradientOptimizer were added

0.4.0

  • OptimizerInterface was added
  • Stochastic gradient descent optimizer was extracted from the linear regressor class
  • Line separators changed for all files (CRLF -> LF)

0.3.1

  • tests for sum, abs, fromRange methods of the TypedVector were added
  • tests for DataTrainTestSplitter were added

0.3.0

  • MAPE cost function was added

0.2.0

  • SGD Regressor refactored (rmse on training removed, estimator added) + example extended

0.1.0

  • Implementation of -, *, / operators and all vectors methods added to the TypedVector

0.0.1

  • Initial version

example/main.dart

import 'dart:async';

import 'package:ml_algo/ml_algo.dart';
import 'package:ml_linalg/matrix.dart';
import 'package:ml_linalg/vector.dart';

/// A simple usage example with synthetic data. To see more complex examples, please visit the other directories in
/// this folder
Future main() async {
  // Let's create a feature matrix (a set of independent variables)
  final features = MLMatrix.from([
    [2.0, 3.0, 4.0, 5.0],
    [12.0, 32.0, 1.0, 3.0],
    [27.0, 3.0, 0.0, 59.0],
  ]);

  // Let's create a vector of dependent variables. Its values are the `true` values used to adjust the regression
  // coefficients
  final labels = MLVector.from([4.3, 3.5, 2.1]);

  // Let's create the regressor itself. With its help we can train a linear model that predicts a label value for
  // new features
  final regressor = LinearRegressor.gradient(
      iterationsLimit: 100,
      initialLearningRate: 0.0005,
      learningRateType: LearningRateType.constant);

  // Let's train our model (training, or fitting, is the process of adjusting the coefficients)
  regressor.fit(features, labels);

  // Let's see the adjusted coefficients
  print('Regression coefficients: ${regressor.weights}');
}

Use this package as a library

1. Depend on it

Add this to your package's pubspec.yaml file:


dependencies:
  ml_algo: ^7.2.0

2. Install it

You can install packages from the command line:

with pub:


$ pub get

with Flutter:


$ flutter packages get

Alternatively, your editor might support pub get or flutter packages get. Check the docs for your editor to learn more.

3. Import it

Now in your Dart code, you can use:


import 'package:ml_algo/ml_algo.dart';
  
Version   Uploaded
7.2.0     Feb 18, 2019
7.1.0     Feb 16, 2019
7.0.0     Feb 14, 2019
6.2.0     Feb 14, 2019
6.1.0     Feb 11, 2019
6.0.6     Feb 10, 2019
6.0.5     Feb 10, 2019
6.0.4     Feb 10, 2019
6.0.3     Feb 10, 2019
6.0.2     Feb 10, 2019

Popularity: 30 (how popular the package is relative to other packages)
Health: 100 (code health derived from static analysis)
Maintenance: 100 (how tidy and up-to-date the package is)
Overall: 65 (weighted score of the above)

We analyzed this package on Feb 20, 2019, and provided a score, details, and suggestions below. The analysis finished with the status "completed" using:

  • Dart: 2.1.0
  • pana: 0.12.13+1

Platforms

Detected platforms: Flutter, other

Primary library: package:ml_algo/ml_algo.dart with components: io.

Health suggestions

Format lib/src/score_to_prob_mapper/float32x4_softmax_mapper_mixin.dart.

Run dartfmt to format lib/src/score_to_prob_mapper/float32x4_softmax_mapper_mixin.dart.

Format lib/src/score_to_prob_mapper/softmax_mapper.dart.

Run dartfmt to format lib/src/score_to_prob_mapper/softmax_mapper.dart.

Dependencies

Package             Constraint        Resolved    Available

Direct dependencies
Dart SDK            >=2.0.0 <3.0.0
csv                 ^4.0.0            4.0.1
logging             ^0.11.3+2         0.11.3+2
ml_linalg           ^5.1.0            5.3.0
tuple               ^1.0.2            1.0.2

Transitive dependencies
matcher                               0.12.4
meta                                  1.1.7
path                                  1.6.2
quiver                                2.0.1
stack_trace                           1.9.3

Dev dependencies
benchmark_harness   >=1.0.0 <2.0.0
build_runner        ^1.1.2
build_test          ^0.10.2
mockito             ^3.0.0
pedantic            1.1.0
test                ^1.2.0