From f2286b9a756a0fc49a211a262887bf73c05c4609 Mon Sep 17 00:00:00 2001 From: Herman Slatman Date: Fri, 13 May 2016 09:03:27 +0200 Subject: [PATCH 01/51] DSSTNE by Amazon added --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index aa2ddfc..ef3ce97 100644 --- a/README.md +++ b/README.md @@ -157,6 +157,7 @@ For a list of free machine learning books available for download, go [here](http * [CNTK](https://github.com/Microsoft/CNTK) - The Computational Network Toolkit (CNTK) by Microsoft Research, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. * [DeepDetect](https://github.com/beniz/deepdetect) - A machine learning API and server written in C++11. It makes state of the art machine learning easy to work with and integrate into existing applications. * [Fido](https://github.com/FidoProject/Fido) - A highly-modular C++ machine learning library for embedded electronics and robotics. +* [DSSTNE](https://github.com/amznlabs/amazon-dsstne) - A software library created by Amazon for training and deploying deep neural networks using GPUs which emphasizes speed and scale over experimental flexibility. #### Natural Language Processing From 54bce13c9a1c699e5b88f9946a5496052390c78f Mon Sep 17 00:00:00 2001 From: Vincent Botta Date: Fri, 13 May 2016 11:03:37 +0200 Subject: [PATCH 02/51] Add Ruffus pipeline for data analysis --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index aa2ddfc..fdda503 100644 --- a/README.md +++ b/README.md @@ -801,6 +801,7 @@ on MNIST digits[DEEP LEARNING] * [pastalog](https://github.com/rewonc/pastalog) - Simple, realtime visualization of neural network training performance. * [caravel](https://github.com/airbnb/caravel) - A data exploration platform designed to be visual, intuitive, and interactive. * [Dora](https://github.com/nathanepstein/dora) - Tools for exploratory data analysis in Python. 
+* [Ruffus](http://www.ruffus.org.uk) - Computation Pipeline library for python. #### Misc Scripts / iPython Notebooks / Codebases From bc9a5ba533e891076ac6503faff6fd16761ba031 Mon Sep 17 00:00:00 2001 From: Kashyap Raval Date: Wed, 18 May 2016 08:01:41 +0530 Subject: [PATCH 03/51] Update README.md --- README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/README.md b/README.md index 035ba6a..8e06c63 100644 --- a/README.md +++ b/README.md @@ -79,6 +79,7 @@ For a list of free machine learning books available for download, go [here](http - [Data Analysis / Data Visualization](#python-data-analysis) - [Misc Scripts / iPython Notebooks / Codebases](#python-misc) - [Kaggle Competition Source Code](#python-kaggle) + - [Neural networks](#python-neural networks) - [Ruby](#ruby) - [Natural Language Processing](#ruby-nlp) - [General-Purpose Machine Learning](#ruby-general-purpose) @@ -840,9 +841,16 @@ on MNIST digits[DEEP LEARNING] * [TDB](https://github.com/ericjang/tdb) - TensorDebugger (TDB) is a visual debugger for deep learning. It features interactive, node-by-node debugging and visualization for TensorFlow. + +#### Neural networks +* [Neural networks](https://github.com/karpathy/neuraltalk) - NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences. 
+ + #### Kaggle Competition Source Code + + * [wiki challenge](https://github.com/hammer/wikichallenge) - An implementation of Dell Zhang's solution to Wikipedia's Participation Challenge on Kaggle * [kaggle insults](https://github.com/amueller/kaggle_insults) - Kaggle Submission for "Detecting Insults in Social Commentary" * [kaggle_acquire-valued-shoppers-challenge](https://github.com/MLWave/kaggle_acquire-valued-shoppers-challenge) - Code for the Kaggle acquire valued shoppers challenge From 1475a2e4c427e9649a4984858fba2c1328501d94 Mon Sep 17 00:00:00 2001 From: Arash Rouhani Date: Wed, 25 May 2016 11:39:56 +0700 Subject: [PATCH 04/51] Java: Remove old projects Eva: Link is broken JAVA-ML: Not updated for ages JSAT: link broken --- README.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/README.md b/README.md index 8e06c63..cae88b7 100644 --- a/README.md +++ b/README.md @@ -306,13 +306,10 @@ For a list of free machine learning books available for download, go [here](http * [Datumbox](https://github.com/datumbox/datumbox-framework) - Machine Learning framework for rapid development of Machine Learning and Statistical applications * [ELKI](http://elki.dbs.ifi.lmu.de/) - Java toolkit for data mining. (unsupervised: clustering, outlier detection etc.) * [Encog](https://github.com/encog/encog-java-core) - An advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation. Encog can also make use of a GPU to further speed processing time. A GUI based workbench is also provided to help model and train neural networks. 
-* [EvA2](www.ra.cs.uni-tuebingen.de/software/eva2/) - Evolutionary Algorithms Framework with Genetic Algorithm, Differential Evolution, Particle Swarm Optimization, Evolution Strategies, Covariance Matrix Adaptation Evolution Strategy, and more * [FlinkML in Apache Flink](https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/index.html) - Distributed machine learning library in Flink * [H2O](https://github.com/h2oai/h2o-3) - ML engine that supports distributed learning on Hadoop, Spark or your laptop via APIs in R, Python, Scala, REST/JSON. * [htm.java](https://github.com/numenta/htm.java) - General Machine Learning library using Numenta’s Cortical Learning Algorithm * [java-deeplearning](https://github.com/deeplearning4j/deeplearning4j) - Distributed Deep Learning Platform for Java, Clojure,Scala -* [JAVA-ML](http://java-ml.sourceforge.net/) - A general ML library with a common interface for all algorithms in Java -* [JSAT](https://code.google.com/p/java-statistical-analysis-tool/) - Numerous Machine Learning algorithms for classification, regression, and clustering. * [Mahout](https://github.com/apache/mahout) - Distributed machine learning * [Meka](http://meka.sourceforge.net/) - An open source implementation of methods for multi-label classification and evaluation (extension to Weka). 
* [MLlib in Apache Spark](http://spark.apache.org/docs/latest/mllib-guide.html) - Distributed machine learning library in Spark From 30ab3ea5c3d941a231f577758a0b9f0acd9b9fbd Mon Sep 17 00:00:00 2001 From: Matthew Cunningham Date: Wed, 25 May 2016 08:19:13 -0500 Subject: [PATCH 05/51] adding naive-apl --- README.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/README.md b/README.md index cae88b7..215f8db 100644 --- a/README.md +++ b/README.md @@ -14,6 +14,8 @@ For a list of free machine learning books available for download, go [here](http +- [APL](#apl) + - [General-Purpose Machine Learning](#apl-general-purpose) - [C](#c) - [General-Purpose Machine Learning](#c-general-purpose) - [Computer Vision](#c-cv) @@ -106,6 +108,11 @@ For a list of free machine learning books available for download, go [here](http + +## APL +#### General-Purpose Machine Learning +* [naive-apl](https://github.com/mattcunningham/naive-apl) - Naive Bayesian Classifier implementation in APL + ## C From b868f981b50ec8a496fa4a0bcb9494080c1f7506 Mon Sep 17 00:00:00 2001 From: Matthew Cunningham Date: Wed, 25 May 2016 08:21:15 -0500 Subject: [PATCH 06/51] adding toc link --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 215f8db..2682648 100644 --- a/README.md +++ b/README.md @@ -110,6 +110,8 @@ For a list of free machine learning books available for download, go [here](http ## APL + + #### General-Purpose Machine Learning * [naive-apl](https://github.com/mattcunningham/naive-apl) - Naive Bayesian Classifier implementation in APL From 3c42357b5c9a367f2248d1f03b0646c4fcc3a9a1 Mon Sep 17 00:00:00 2001 From: Dmitriy Volkov Date: Fri, 27 May 2016 00:28:22 +0300 Subject: [PATCH 07/51] Add REP --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 2682648..1de39af 100644 --- a/README.md +++ b/README.md @@ -768,6 +768,7 @@ on MNIST digits[DEEP LEARNING] * [MXNet](https://github.com/dmlc/mxnet) - Lightweight, 
Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, Javascript and more. * [milk](https://github.com/luispedro/milk) - Machine learning toolkit focused on supervised classification. * [TFLearn](https://github.com/tflearn/tflearn) - Deep learning library featuring a higher-level API for TensorFlow. +* [REP](https://github.com/yandex/rep) - an IPython-based environment for conducting data-driven research in a consistent and reproducible way. REP is not trying to substitute scikit-learn, but extends it and provides better user experience. #### Data Analysis / Data Visualization From d2ab62151d75cfcc0893beb95aff6f239e51ac8b Mon Sep 17 00:00:00 2001 From: Nelson Liu Date: Sun, 29 May 2016 23:16:45 -0700 Subject: [PATCH 08/51] Fix grammar in deprecation terms --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 2682648..9db439f 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php. If you want to contribute to this list (please do), send me a pull request or contact me [@josephmisiti](https://twitter.com/josephmisiti) -Also, when you noticed that listed repository should be deprecated. +Also, a listed repository should be deprecated if: * Repository's owner explicitly say that "this library is not maintained". * Not committed for long time (2~3 years). From 98088fa0eac83b41cd2578cc54d617bad35590ca Mon Sep 17 00:00:00 2001 From: Nelson Liu Date: Sun, 29 May 2016 23:20:10 -0700 Subject: [PATCH 09/51] Deprecate leaf Leaf, the machine learning framework for rust, is unfortunately no longer being maintained. 
see: https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 2682648..4a117c2 100644 --- a/README.md +++ b/README.md @@ -926,7 +926,7 @@ on MNIST digits[DEEP LEARNING] * [deeplearn-rs](https://github.com/tedsta/deeplearn-rs) - deeplearn-rs provides simple networks that use matrix multiplication, addition, and ReLU under the MIT license. * [rustlearn](https://github.com/maciejkula/rustlearn) - a machine learning framework featuring logistic regression, support vector machines, decision trees and random forests. * [rusty-machine](https://github.com/AtheMathmo/rusty-machine) - a pure-rust machine learning library. -* [leaf](https://github.com/autumnai/leaf) - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. +* [leaf](https://github.com/autumnai/leaf) - open source framework for machine intelligence, sharing concepts from TensorFlow and Caffe. Available under the MIT license. [**[Deprecated]**](https://medium.com/@mjhirn/tensorflow-wins-89b78b29aafb#.s0a3uy4cc) * [RustNN](https://github.com/jackm321/RustNN) - RustNN is a feedforward neural network library. From 1deadaac49015fa8fb867310e82743f9c06fc625 Mon Sep 17 00:00:00 2001 From: Daniel Khashabi Date: Mon, 30 May 2016 13:23:11 -0700 Subject: [PATCH 10/51] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index c055756..616c81d 100644 --- a/README.md +++ b/README.md @@ -307,6 +307,7 @@ For a list of free machine learning books available for download, go [here](http * [ClearTK](https://code.google.com/p/cleartk/) - ClearTK provides a framework for developing statistical natural language processing (NLP) components in Java and is built on top of Apache UIMA. 
* [Apache cTAKES](http://ctakes.apache.org/) - Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text. * [ClearNLP](http://www.clearnlp.com) - The ClearNLP project provides software and resources for natural language processing. The project started at the Center for Computational Language and EducAtion Research, and is currently developed by the Center for Language and Information Research at Emory University. This project is under the Apache 2 license. +* [CogcompNLP](https://github.com/IllinoisCogComp/illinois-cogcomp-nlp) - This project collects a number of core libraries for Natural Language Processing (NLP) developed in the University of Illinois' Cognitive Computation Group, for example `illinois-core-utilities` which provides a set of NLP-friendly data structures and a number of NLP-related utilities that support writing NLP applications, running experiments, etc, `illinois-edison` a library for feature extraction from illinois-core-utilities data structures and many other packages. #### General-Purpose Machine Learning @@ -333,6 +334,7 @@ For a list of free machine learning books available for download, go [here](http * [SystemML](https://github.com/apache/incubator-systemml) - flexible, scalable machine learning (ML) language. * [WalnutiQ](https://github.com/WalnutiQ/WalnutiQ) - object oriented model of the human brain * [Weka](http://www.cs.waikato.ac.nz/ml/weka/) - Weka is a collection of machine learning algorithms for data mining tasks +* [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. 
#### Speech Recognition * [CMU Sphinx](http://cmusphinx.sourceforge.net/) - Open Source Toolkit For Speech Recognition purely based on Java speech recognition library. From 09432f59866880631effa7ec66d8dc218c1cd897 Mon Sep 17 00:00:00 2001 From: deetruong89 Date: Tue, 31 May 2016 23:18:29 -0700 Subject: [PATCH 11/51] Update README.md --- README.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/README.md b/README.md index 616c81d..4f218c9 100644 --- a/README.md +++ b/README.md @@ -104,6 +104,8 @@ For a list of free machine learning books available for download, go [here](http - [General-Purpose Machine Learning](#scala-general-purpose) - [Swift](#swift) - [General-Purpose Machine Learning](#swift-general-purpose) +- [TensorFlow](#tensor) + - [General-Purpose Machine Learning](#tensor-general-purpose) - [Credits](#credits) @@ -1121,6 +1123,13 @@ on MNIST digits[DEEP LEARNING] It currently allows using deep convolutional neural network models trained in Caffe on Apple operating systems. * [AIToolbox](https://github.com/KevinCoble/AIToolbox) - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians. + +## TensorFlow + + +#### General-Purpose Machine Learning +* [Awesome TensorFlow](https://github.com/jtoy/awesome-tensorflow) - A list of all things related to TensorFlow + ## Credits From 6afbc55f2348801162a2459cdcd38a778f7323d5 Mon Sep 17 00:00:00 2001 From: David van Leeuwen Date: Thu, 16 Jun 2016 13:49:20 +0200 Subject: [PATCH 12/51] Update README.md Added two selfies under Julia packages (ROCAnalysis and GaussianMixtures), and the ScikitLearn Julia interface to the python package. 
--- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 4f218c9..4bf5227 100644 --- a/README.md +++ b/README.md @@ -462,6 +462,9 @@ For a list of free machine learning books available for download, go [here](http * [ManifoldLearning](https://github.com/wildart/ManifoldLearning.jl) - A Julia package for manifold learning and nonlinear dimensionality reduction * [MXNet](https://github.com/dmlc/mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, Javascript and more. * [Merlin](https://github.com/hshindo/Merlin.jl) - Flexible Deep Learning Framework in Julia +* [ROCAnalysis](https://github.com/davidavdav/ROCAnalysis.jl) - Receiver Operating Characteristics and functions for evaluation probabilistic binary classifiers +* [GaussianMixtures] (https://github.com/davidavdav/GaussianMixtures.jl) - Large scale Gaussian Mixture Models +* [ScikitLearn] (https://github.com/cstjean/ScikitLearn.jl) - Julia implementation of the scikit-learn API #### Natural Language Processing From b36d613841fcb2a7dabf9fe04affe7b50933d91a Mon Sep 17 00:00:00 2001 From: Vladimir Metnev Date: Sat, 18 Jun 2016 00:04:53 +0300 Subject: [PATCH 13/51] add Datamaps --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 4bf5227..f0e24f3 100644 --- a/README.md +++ b/README.md @@ -389,6 +389,7 @@ For a list of free machine learning books available for download, go [here](http * [Z3d](https://github.com/NathanEpstein/Z3d) - Easily make interactive 3d plots built on Three.js * [Sigma.js](http://sigmajs.org/) - JavaScript library dedicated to graph drawing. * [C3.js](http://c3js.org/)- customizable library based on D3.js for easy chart drawing. +* [Datamaps](http://datamaps.github.io/)- Customizable SVG map/geo visualizations using D3.js. * [ZingChart](http://www.zingchart.com/)- library written on Vanilla JS for big data visualization. 
* [cheminfo](http://www.cheminfo.org/) - Platform for data visualization and analysis, using the [visualizer](https://github.com/npellet/visualizer) project. From f0cfbc4cefd87285b6d871fe1ffc0ee1692f8bd4 Mon Sep 17 00:00:00 2001 From: Max Halford Date: Mon, 20 Jun 2016 13:44:39 +0200 Subject: [PATCH 14/51] Update README.md --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index f0e24f3..f4c43a6 100644 --- a/README.md +++ b/README.md @@ -254,6 +254,7 @@ For a list of free machine learning books available for download, go [here](http #### General-Purpose Machine Learning +* [gago](https://github.com/MaxHalford/gago) - Multi-population, flexible, parallel genetic algorithm. * [Go Learn](https://github.com/sjwhitworth/golearn) - Machine Learning for Go * [go-pr](https://github.com/daviddengcn/go-pr) - Pattern recognition package in Go lang. * [go-ml](https://github.com/alonsovidales/go_ml) - Linear / Logistic regression, Neural Networks, Collaborative Filtering and Gaussian Multivariate Distribution From 0c7cdc3ac07bdbcd5902c364da6e7799fd2f7f74 Mon Sep 17 00:00:00 2001 From: Eric Schles Date: Tue, 21 Jun 2016 09:37:16 -0400 Subject: [PATCH 15/51] Update README.md --- README.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/README.md b/README.md index f0e24f3..db0a52b 100644 --- a/README.md +++ b/README.md @@ -691,6 +691,7 @@ on MNIST digits[DEEP LEARNING] * [SimpleCV](http://simplecv.org/) - An open source computer vision framework that gives access to several high-powered computer vision libraries, such as OpenCV. Written on Python and runs on Mac, Windows, and Ubuntu Linux. * [Vigranumpy](https://github.com/ukoethe/vigra) - Python bindings for the VIGRA C++ computer vision library. * [OpenFace](https://cmusatyalab.github.io/openface/) - Free and open source face recognition with deep neural networks. 
+* [PCV](https://github.com/jesolem/PCV) - Open source Python module for computer vision #### Natural Language Processing @@ -716,6 +717,11 @@ on MNIST digits[DEEP LEARNING] * [colibri-core](https://github.com/proycon/colibri-core) - Python binding to C++ library for extracting and working with with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way. * [spaCy](https://github.com/honnibal/spaCy/) - Industrial strength NLP with Python and Cython. * [PyStanfordDependencies](https://github.com/dmcc/PyStanfordDependencies) - Python interface for converting Penn Treebank trees to Stanford Dependencies. +* [Distance](https://github.com/doukremt/distance) - Levenshtein and Hamming distance computation +* [Fuzzy Wuzzy](https://github.com/seatgeek/fuzzywuzzy) - Fuzzy String Matching in Python +* [jellyfish](https://github.com/jamesturk/jellyfishå) - a python library for doing approximate and phonetic matching of strings. +* [editdistance](https://pypi.python.org/pypi/editdistance) - fast implementation of edit distance +* [textacy](https://github.com/chartbeat-labs/textacy) - higher-level NLP built on Spacy #### General-Purpose Machine Learning From ce2f7ced0a8331917d3d42241a575598b505bdab Mon Sep 17 00:00:00 2001 From: Eric Schles Date: Tue, 21 Jun 2016 09:41:45 -0400 Subject: [PATCH 16/51] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index db0a52b..fd32c05 100644 --- a/README.md +++ b/README.md @@ -825,6 +825,8 @@ on MNIST digits[DEEP LEARNING] * [caravel](https://github.com/airbnb/caravel) - A data exploration platform designed to be visual, intuitive, and interactive. * [Dora](https://github.com/nathanepstein/dora) - Tools for exploratory data analysis in Python. * [Ruffus](http://www.ruffus.org.uk) - Computation Pipeline library for python. +* [SOMPY](https://github.com/sevamoo/SOMPY) - Self Organizing Map written in Python (Uses neural networks for data analysis). 
+* [HDBScan](https://github.com/lmcinnes/hdbscan) - implementation of the hdbscan algorithm in Python - used for clustering #### Misc Scripts / iPython Notebooks / Codebases From 17e20bcf0c918ea3da1189b19f7d6a7a8feb8afc Mon Sep 17 00:00:00 2001 From: Ajay Jain Date: Wed, 22 Jun 2016 10:55:43 -0700 Subject: [PATCH 17/51] Fix error in Jellyfish link --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index e109d3c..5cc2b87 100644 --- a/README.md +++ b/README.md @@ -720,7 +720,7 @@ on MNIST digits[DEEP LEARNING] * [PyStanfordDependencies](https://github.com/dmcc/PyStanfordDependencies) - Python interface for converting Penn Treebank trees to Stanford Dependencies. * [Distance](https://github.com/doukremt/distance) - Levenshtein and Hamming distance computation * [Fuzzy Wuzzy](https://github.com/seatgeek/fuzzywuzzy) - Fuzzy String Matching in Python -* [jellyfish](https://github.com/jamesturk/jellyfishå) - a python library for doing approximate and phonetic matching of strings. +* [jellyfish](https://github.com/jamesturk/jellyfish) - a python library for doing approximate and phonetic matching of strings. 
* [editdistance](https://pypi.python.org/pypi/editdistance) - fast implementation of edit distance * [textacy](https://github.com/chartbeat-labs/textacy) - higher-level NLP built on Spacy From d1eb3105cea041945de7767baa378150ef0b0ddf Mon Sep 17 00:00:00 2001 From: Daniel Khashabi Date: Wed, 22 Jun 2016 13:20:04 -0700 Subject: [PATCH 18/51] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 5cc2b87..4d4fad7 100644 --- a/README.md +++ b/README.md @@ -339,6 +339,7 @@ For a list of free machine learning books available for download, go [here](http * [Weka](http://www.cs.waikato.ac.nz/ml/weka/) - Weka is a collection of machine learning algorithms for data mining tasks * [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. + #### Speech Recognition * [CMU Sphinx](http://cmusphinx.sourceforge.net/) - Open Source Toolkit For Speech Recognition purely based on Java speech recognition library. @@ -1121,6 +1122,7 @@ on MNIST digits[DEEP LEARNING] * [H2O Sparkling Water](https://github.com/h2oai/sparkling-water) - H2O and Spark interoperability. * [FlinkML in Apache Flink](https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/index.html) - Distributed machine learning library in Flink * [DynaML](https://github.com/mandar2812/DynaML) - Scala Library/REPL for Machine Learning Research +* [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. 
## Swift From e67f0ae9ed04851762f8135a5b9b5ee454ba111e Mon Sep 17 00:00:00 2001 From: tbds Date: Fri, 24 Jun 2016 12:19:55 +0100 Subject: [PATCH 19/51] Add Torch-based deep learning library torchnet --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 4d4fad7..9032192 100644 --- a/README.md +++ b/README.md @@ -515,6 +515,7 @@ For a list of free machine learning books available for download, go [here](http * [randomkit](http://jucor.github.io/torch-randomkit/) - Numpy's randomkit, wrapped for Torch * [signal](http://soumith.ch/torch-signal/signal/) - A signal processing toolbox for Torch-7. FFT, DCT, Hilbert, cepstrums, stft * [nn](https://github.com/torch/nn) - Neural Network package for Torch + * [torchnet](https://github.com/torchnet/torchnet) - framework for torch which provides a set of abstractions aiming at encouraging code re-use as well as encouraging modular programming * [nngraph](https://github.com/torch/nngraph) - This package provides graphical computation for nn library in Torch7. * [nnx](https://github.com/clementfarabet/lua---nnx) - A completely unstable and experimental package that extends Torch's builtin nn library * [rnn](https://github.com/Element-Research/rnn) - A Recurrent Neural Network library that extends Torch's nn. RNNs, LSTMs, GRUs, BRNNs, BLSTMs, etc. From 172f6e7b07b01972ed18a03865410be0189e2d7f Mon Sep 17 00:00:00 2001 From: Kevin Markham Date: Fri, 1 Jul 2016 12:31:02 -0400 Subject: [PATCH 20/51] link to IPython notebooks from scikit-learn video series --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 9032192..6419d47 100644 --- a/README.md +++ b/README.md @@ -865,6 +865,7 @@ on MNIST digits[DEEP LEARNING] * [Optunity examples](http://docs.optunity.net/notebooks/index.html) - Examples demonstrating how to use Optunity in synergy with machine learning libraries. 
* [Dive into Machine Learning with Python Jupyter notebook and scikit-learn](https://github.com/hangtwenty/dive-into-machine-learning) - "I learned Python by hacking first, and getting serious *later.* I wanted to do this with Machine Learning. If this is your style, join me in getting a bit ahead of yourself." * [TDB](https://github.com/ericjang/tdb) - TensorDebugger (TDB) is a visual debugger for deep learning. It features interactive, node-by-node debugging and visualization for TensorFlow. +* [Introduction to machine learning with scikit-learn](https://github.com/justmarkham/scikit-learn-videos) - IPython notebooks from Data School's video tutorials on scikit-learn. From f8454ab1d5d8e4764e36853d5c21502e24e4baea Mon Sep 17 00:00:00 2001 From: Kevin Markham Date: Fri, 1 Jul 2016 12:34:52 -0400 Subject: [PATCH 21/51] add link on blogs page to Data School --- blogs.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/blogs.md b/blogs.md index d6b84c1..d188cc1 100644 --- a/blogs.md +++ b/blogs.md @@ -91,6 +91,8 @@ http://www.randalolson.com/blog/ http://www.johndcook.com/blog/r_language_for_programmers/ +http://www.dataschool.io/ + Math ---- From 6cea0093074a6c15a37f90234fc1a75172f888ca Mon Sep 17 00:00:00 2001 From: Nickil Maveli Date: Sun, 10 Jul 2016 01:42:31 +0530 Subject: [PATCH 22/51] Add Lasagne Neural network library --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 6419d47..39335c5 100644 --- a/README.md +++ b/README.md @@ -743,6 +743,7 @@ on MNIST digits[DEEP LEARNING] * [NuPIC](https://github.com/numenta/nupic) - Numenta Platform for Intelligent Computing. * [Pylearn2](https://github.com/lisa-lab/pylearn2) - A Machine Learning library based on [Theano](https://github.com/Theano/Theano). * [keras](https://github.com/fchollet/keras) - Modular neural network library based on [Theano](https://github.com/Theano/Theano). 
+* [Lasagne](https://github.com/Lasagne/Lasagne) - Lightweight library to build and train neural networks in Theano. * [hebel](https://github.com/hannes-brt/hebel) - GPU-Accelerated Deep Learning Library in Python. * [Chainer](https://github.com/pfnet/chainer) - Flexible neural network framework * [gensim](https://github.com/piskvorky/gensim) - Topic Modelling for Humans. From e310672347c32da2b1c3d3374fc67ca0d9e87712 Mon Sep 17 00:00:00 2001 From: Guled Date: Sat, 16 Jul 2016 15:22:40 -0500 Subject: [PATCH 23/51] Added MLKit Framework to Swift Machine Learning Category --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 39335c5..901a48a 100644 --- a/README.md +++ b/README.md @@ -1140,6 +1140,7 @@ on MNIST digits[DEEP LEARNING] * [DeepLearningKit](http://deeplearningkit.org/) an Open Source Deep Learning Framework for Apple’s iOS, OS X and tvOS. It currently allows using deep convolutional neural network models trained in Caffe on Apple operating systems. * [AIToolbox](https://github.com/KevinCoble/AIToolbox) - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians. +* [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. ## TensorFlow From 6785d71e0563d4d14e631baa75dfb2bfd5d7d668 Mon Sep 17 00:00:00 2001 From: Dayvid Victor Date: Tue, 19 Jul 2016 11:33:50 -0300 Subject: [PATCH 24/51] Add imbalanced-learn Added imbalanced-learn to Python / General-Purpose Machine Learning \[Awesome\] imbalanced-learn is part of [scikit-learn-contrib](https://github.com/scikit-learn-contrib), fully compatible with scikit-learn, and provides several techniques for handling imbalanced data, including under sampling, over sampling, and ensembling. 
On the [github page](https://github.com/scikit-learn-contrib/imbalanced-learn) it has (to this date): - 359 Stargazers - 125 Forks - 30 Watchers --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 901a48a..7fb011d 100644 --- a/README.md +++ b/README.md @@ -757,6 +757,7 @@ on MNIST digits[DEEP LEARNING] * [Bolt](https://github.com/pprett/bolt) - Bolt Online Learning Toolbox * [CoverTree](https://github.com/patvarilly/CoverTree) - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree * [nilearn](https://github.com/nilearn/nilearn) - Machine learning for NeuroImaging in Python +* [imbalanced-learn](http://contrib.scikit-learn.org/imbalanced-learn/) - Python module to perform under sampling and over sampling with various techniques. * [Shogun](https://github.com/shogun-toolbox/shogun) - The Shogun Machine Learning Toolbox * [Pyevolve](https://github.com/perone/Pyevolve) - Genetic algorithm framework. * [Caffe](http://caffe.berkeleyvision.org) - A deep learning framework developed with cleanliness, readability, and speed in mind. 
From c09a4735f9515f62be6ebdf1be8298c1824779b0 Mon Sep 17 00:00:00 2001 From: David Shaub Date: Fri, 29 Jul 2016 08:37:38 -0600 Subject: [PATCH 25/51] add forecast and forecastHybrid --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 7fb011d..9b5e706 100644 --- a/README.md +++ b/README.md @@ -978,6 +978,8 @@ on MNIST digits[DEEP LEARNING] * [elasticnet](http://cran.r-project.org/web/packages/elasticnet/index.html) - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA * [ElemStatLearn](http://cran.r-project.org/web/packages/ElemStatLearn/index.html) - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman * [evtree](http://cran.r-project.org/web/packages/evtree/index.html) - evtree: Evolutionary Learning of Globally Optimal Trees +* [forecast](http://cran.r-project.org/web/packages/forecast/index.html) - forecast: Timeseries forecasting using ARIMA, ETS, STLM, TBATS, and neural network models +* [forecastHybrid](http://cran.r-project.org/web/packages/forecastHybrid/index.html) - forecastHybrid: Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package * [fpc](http://cran.r-project.org/web/packages/fpc/index.html) - fpc: Flexible procedures for clustering * [frbs](http://cran.r-project.org/web/packages/frbs/index.html) - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks * [GAMBoost](http://cran.r-project.org/web/packages/GAMBoost/index.html) - GAMBoost: Generalized linear and additive models by likelihood based boosting From 7ed0881c4602306453da14b21c2e2e9160180825 Mon Sep 17 00:00:00 2001 From: Antonio De Luca Date: Sun, 31 Jul 2016 13:39:07 +0100 Subject: [PATCH 26/51] Added new General-Purpose Machine Learning JavaScript 
library: DN2A - Digital Neural Network Architecture. --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 7fb011d..9e0efd7 100644 --- a/README.md +++ b/README.md @@ -403,6 +403,7 @@ For a list of free machine learning books available for download, go [here](http * [Clusterfck](http://harthur.github.io/clusterfck/) - Agglomerative hierarchical clustering implemented in Javascript for Node.js and the browser * [Clustering.js](https://github.com/emilbayes/clustering.js) - Clustering algorithms implemented in Javascript for Node.js and the browser * [Decision Trees](https://github.com/serendipious/nodejs-decision-tree-id3) - NodeJS Implementation of Decision Tree using ID3 Algorithm +* [DN2A](https://github.com/dn2a/dn2a-javascript) - Digital Neural Networks Architecture * [figue](http://code.google.com/p/figue/) - K-means, fuzzy c-means and agglomerative clustering * [Node-fann](https://github.com/rlidwka/node-fann) - FANN (Fast Artificial Neural Network Library) bindings for Node.js * [Kmeans.js](https://github.com/emilbayes/kMeans.js) - Simple Javascript implementation of the k-means algorithm, for node.js and the browser From c5ded6ac027faba147a035aa4c6cc12ea0b4c61c Mon Sep 17 00:00:00 2001 From: Fred Wu Date: Fri, 5 Aug 2016 15:24:12 +1000 Subject: [PATCH 27/51] Added Elixir libs --- README.md | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 1b8d673..a97c731 100644 --- a/README.md +++ b/README.md @@ -31,6 +31,9 @@ For a list of free machine learning books available for download, go [here](http - [Natural Language Processing](#clojure-nlp) - [General-Purpose Machine Learning](#clojure-general-purpose) - [Data Analysis / Data Visualization](#clojure-data-analysis) +- [Elixir](#elixir) + - [General-Purpose Machine Learning](#elixir-general-purpose) + - [Natural Language Processing](#elixir-nlp) - [Erlang](#erlang) - [General-Purpose Machine 
Learning](#erlang-general-purpose) - [Go](#go) @@ -92,7 +95,7 @@ For a list of free machine learning books available for download, go [here](http - [R](#r) - [General-Purpose Machine Learning](#r-general-purpose) - [Data Analysis / Data Visualization](#r-data-analysis) -- [SAS] (#sas) +- [SAS](#sas) - [General-Purpose Machine Learning] (#sas-general-purpose) - [Data Analysis / Data Visualization] (#sas-data-analysis) - [High Performance Machine Learning (MPP)] (#sas-mpp) @@ -233,11 +236,25 @@ For a list of free machine learning books available for download, go [here](http * [PigPen](https://github.com/Netflix/PigPen) - Map-Reduce for Clojure. * [Envision](https://github.com/clojurewerkz/envision) - Clojure Data Visualisation library, based on Statistiker and D3 + +## Elixir + + +#### General-Purpose Machine Learning + +* [Simple Bayes](https://github.com/fredwu/simple_bayes) - A Simple Bayes / Naive Bayes implementation in Elixir. + + +#### Natural Language Processing + +* [Stemmer](https://github.com/fredwu/stemmer) - An English (Porter2) stemming implementation in Elixir. + ## Erlang #### General-Purpose Machine Learning + * [Disco](https://github.com/discoproject/disco/) - Map Reduce in Erlang @@ -337,7 +354,7 @@ For a list of free machine learning books available for download, go [here](http * [SystemML](https://github.com/apache/incubator-systemml) - flexible, scalable machine learning (ML) language. * [WalnutiQ](https://github.com/WalnutiQ/WalnutiQ) - object oriented model of the human brain * [Weka](http://www.cs.waikato.ac.nz/ml/weka/) - Weka is a collection of machine learning algorithms for data mining tasks -* [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. 
+* [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. #### Speech Recognition @@ -1129,7 +1146,7 @@ on MNIST digits[DEEP LEARNING] * [H2O Sparkling Water](https://github.com/h2oai/sparkling-water) - H2O and Spark interoperability. * [FlinkML in Apache Flink](https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/index.html) - Distributed machine learning library in Flink * [DynaML](https://github.com/mandar2812/DynaML) - Scala Library/REPL for Machine Learning Research -* [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. +* [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. ## Swift @@ -1144,7 +1161,7 @@ on MNIST digits[DEEP LEARNING] * [DeepLearningKit](http://deeplearningkit.org/) an Open Source Deep Learning Framework for Apple’s iOS, OS X and tvOS. It currently allows using deep convolutional neural network models trained in Caffe on Apple operating systems. * [AIToolbox](https://github.com/KevinCoble/AIToolbox) - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians. -* [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. +* [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. 
## TensorFlow From 5ff7c04c34eaf91dff2625d0d795944c66fba0ba Mon Sep 17 00:00:00 2001 From: Ruslan Israfilov Date: Thu, 11 Aug 2016 22:31:59 +0300 Subject: [PATCH 28/51] Added reference to Intel(R) DAAL --- README.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 1b8d673..8ec1adc 100644 --- a/README.md +++ b/README.md @@ -170,6 +170,7 @@ For a list of free machine learning books available for download, go [here](http * [DeepDetect](https://github.com/beniz/deepdetect) - A machine learning API and server written in C++11. It makes state of the art machine learning easy to work with and integrate into existing applications. * [Fido](https://github.com/FidoProject/Fido) - A highly-modular C++ machine learning library for embedded electronics and robotics. * [DSSTNE](https://github.com/amznlabs/amazon-dsstne) - A software library created by Amazon for training and deploying deep neural networks using GPUs which emphasizes speed and scale over experimental flexibility. +* [Intel(R) DAAL](https://github.com/01org/daal) - A high performance software library developed by Intel and optimized for Intel's architectures. Library provides algorithmic building blocks for all stages of data analytics and allows to process data in batch, online and distributed modes. #### Natural Language Processing @@ -337,7 +338,7 @@ For a list of free machine learning books available for download, go [here](http * [SystemML](https://github.com/apache/incubator-systemml) - flexible, scalable machine learning (ML) language. 
* [WalnutiQ](https://github.com/WalnutiQ/WalnutiQ) - object oriented model of the human brain * [Weka](http://www.cs.waikato.ac.nz/ml/weka/) - Weka is a collection of machine learning algorithms for data mining tasks -* [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. +* [LBJava](https://github.com/IllinoisCogComp/lbjava/) - Learning Based Java is a modeling language for the rapid development of software systems, offers a convenient, declarative syntax for classifier and constraint definition directly in terms of the objects in the programmer's application. #### Speech Recognition @@ -1129,7 +1130,7 @@ on MNIST digits[DEEP LEARNING] * [H2O Sparkling Water](https://github.com/h2oai/sparkling-water) - H2O and Spark interoperability. * [FlinkML in Apache Flink](https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/index.html) - Distributed machine learning library in Flink * [DynaML](https://github.com/mandar2812/DynaML) - Scala Library/REPL for Machine Learning Research -* [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. +* [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. ## Swift @@ -1144,7 +1145,7 @@ on MNIST digits[DEEP LEARNING] * [DeepLearningKit](http://deeplearningkit.org/) an Open Source Deep Learning Framework for Apple’s iOS, OS X and tvOS. It currently allows using deep convolutional neural network models trained in Caffe on Apple operating systems. * [AIToolbox](https://github.com/KevinCoble/AIToolbox) - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians. 
-* [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. +* [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. ## TensorFlow From 7c53482733828946abde8c755666cb09c8a7ee48 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois=20Maillet?= Date: Mon, 22 Aug 2016 14:21:11 -0400 Subject: [PATCH 29/51] Update README.md Added reference to MLDB --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 859116d..306feff 100644 --- a/README.md +++ b/README.md @@ -174,6 +174,7 @@ For a list of free machine learning books available for download, go [here](http * [Fido](https://github.com/FidoProject/Fido) - A highly-modular C++ machine learning library for embedded electronics and robotics. * [DSSTNE](https://github.com/amznlabs/amazon-dsstne) - A software library created by Amazon for training and deploying deep neural networks using GPUs which emphasizes speed and scale over experimental flexibility. * [Intel(R) DAAL](https://github.com/01org/daal) - A high performance software library developed by Intel and optimized for Intel's architectures. Library provides algorithmic building blocks for all stages of data analytics and allows to process data in batch, online and distributed modes. +* [MLDB](http://mldb.ai) - The Machine Learning Database is a database designed for machine learning. Send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs. 
#### Natural Language Processing From 74b96a60ba2a96cf7968a81fdb8bfbd8c1b6f44d Mon Sep 17 00:00:00 2001 From: Valentyn Danylchuk Date: Tue, 23 Aug 2016 00:53:19 +0200 Subject: [PATCH 30/51] add SwiftLearner --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 306feff..8353b0e 100644 --- a/README.md +++ b/README.md @@ -1149,6 +1149,7 @@ on MNIST digits[DEEP LEARNING] * [FlinkML in Apache Flink](https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/index.html) - Distributed machine learning library in Flink * [DynaML](https://github.com/mandar2812/DynaML) - Scala Library/REPL for Machine Learning Research * [Saul](https://github.com/IllinoisCogComp/saul/) - Flexible Declarative Learning-Based Programming. +* [SwiftLearner](https://github.com/valdanylchuk/swiftlearner/) - Simply written algorithms to help study ML or write your own implementations. ## Swift From 92265fbe1099b7726081ce22af9f740701a04b6a Mon Sep 17 00:00:00 2001 From: Ivan Savov Date: Wed, 31 Aug 2016 11:34:32 -0400 Subject: [PATCH 31/51] link fix The PDF is also available here https://web.archive.org/web/20120417212823/http://faculty.ksu.edu.sa/69424/us_BOOk/Introduction%20to%20Applied%20Bayesian%20Statistics.pdf but citeseerx seems like the better approach... 
--- books.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/books.md b/books.md index 5cd6735..d63a16f 100644 --- a/books.md +++ b/books.md @@ -25,7 +25,7 @@ The following is a list of free, open source books on machine learning, statisti * [A Quest for AI](http://ai.stanford.edu/~nilsson/QAI/qai.pdf) * [Introduction to Applied Bayesian Statistics and Estimation for -Social Scientists](http://faculty.ksu.edu.sa/69424/us_BOOk/Introduction%20to%20Applied%20Bayesian%20Statistics.pdf) +Social Scientists](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.177.857&rep=rep1&type=pdf) * [Bayesian Modeling, Inference and Prediction](http://users.soe.ucsc.edu/~draper/draper-BMIP-dec2005.pdf) * [A Course in Machine Learning](http://ciml.info/) From 51ed82e0aa62067c1164a1e505676da6e6cc2621 Mon Sep 17 00:00:00 2001 From: Ivan Savov Date: Wed, 31 Aug 2016 11:38:44 -0400 Subject: [PATCH 32/51] whitespace fix --- books.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/books.md b/books.md index d63a16f..2c3360d 100644 --- a/books.md +++ b/books.md @@ -23,9 +23,7 @@ The following is a list of free, open source books on machine learning, statisti * [Reinforcement Learning](http://www.intechopen.com/books/reinforcement_learning) * [Machine Learning](http://www.intechopen.com/books/machine_learning) * [A Quest for AI](http://ai.stanford.edu/~nilsson/QAI/qai.pdf) -* [Introduction to Applied Bayesian -Statistics and Estimation for -Social Scientists](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.177.857&rep=rep1&type=pdf) +* [Introduction to Applied Bayesian Statistics and Estimation for Social Scientists](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.177.857&rep=rep1&type=pdf) - Scott M. 
Lynch * [Bayesian Modeling, Inference and Prediction](http://users.soe.ucsc.edu/~draper/draper-BMIP-dec2005.pdf) * [A Course in Machine Learning](http://ciml.info/) From 12ce3dd323eb016339e52da881f8c1a0c887fe7b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Filip=20Mol=C4=8D=C3=ADk?= Date: Thu, 1 Sep 2016 08:52:10 +0200 Subject: [PATCH 33/51] Added new python library Neuron Added new python library Neuron for time series predictions --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 8353b0e..6697f85 100644 --- a/README.md +++ b/README.md @@ -893,6 +893,7 @@ on MNIST digits[DEEP LEARNING] #### Neural networks * [Neural networks](https://github.com/karpathy/neuraltalk) - NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences. +* [Neuron](https://github.com/molcik/Neuron) - Neuron is simple class for time series predictions. It's utilize LNU (Linear Neural Unit), QNU (Quadratic Neural Unit), RBF (Radial Basis Function), MLP (Multi Layer Perceptron), MLP-ELM (Multi Layer Perceptron - Extreme Learning Machine) neural networks learned with Gradient descent or LeLevenberg–Marquardt algorithm. From e68532cabb9b4d1961fc480e9b591724355462b6 Mon Sep 17 00:00:00 2001 From: NathanEpstein Date: Sat, 3 Sep 2016 15:00:18 -0400 Subject: [PATCH 34/51] add pydexter to python visualization --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 6697f85..eb0c189 100644 --- a/README.md +++ b/README.md @@ -830,6 +830,7 @@ on MNIST digits[DEEP LEARNING] * [plotly](https://plot.ly/python/) - Collaborative web plotting for Python and matplotlib. * [vincent](https://github.com/wrobstory/vincent) - A Python to Vega translator. * [d3py](https://github.com/mikedewar/d3py) - A plotting library for Python, based on [D3.js](http://d3js.org/). +* [PyDexter](https://github.com/D3xterjs/pydexter) - Simple plotting for Python. 
Wrapper for D3xterjs; easily render charts in-browser. * [ggplot](https://github.com/yhat/ggplot) - Same API as ggplot2 for R. * [ggfortify](https://github.com/sinhrks/ggfortify) - Unified interface to ggplot2 popular R packages. * [Kartograph.py](https://github.com/kartograph/kartograph.py) - Rendering beautiful SVG maps in Python. From e257a7cf15f7bed627802d96ef20218b804483c0 Mon Sep 17 00:00:00 2001 From: Norbert Date: Sun, 4 Sep 2016 09:30:21 +0200 Subject: [PATCH 35/51] Practical XGBoost in Python online course --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6697f85..8987c34 100644 --- a/README.md +++ b/README.md @@ -888,7 +888,7 @@ on MNIST digits[DEEP LEARNING] * [Dive into Machine Learning with Python Jupyter notebook and scikit-learn](https://github.com/hangtwenty/dive-into-machine-learning) - "I learned Python by hacking first, and getting serious *later.* I wanted to do this with Machine Learning. If this is your style, join me in getting a bit ahead of yourself." * [TDB](https://github.com/ericjang/tdb) - TensorDebugger (TDB) is a visual debugger for deep learning. It features interactive, node-by-node debugging and visualization for TensorFlow. * [Introduction to machine learning with scikit-learn](https://github.com/justmarkham/scikit-learn-videos) - IPython notebooks from Data School's video tutorials on scikit-learn. 
- +* [Practical XGBoost in Python](http://education.parrotprediction.teachable.com/courses/practical-xgboost-in-python) - comprehensive online course about using XGBoost in Python #### Neural networks From 1b6f0979f631a80e87693175230df2b3142d511d Mon Sep 17 00:00:00 2001 From: Ilker Kesen Date: Tue, 6 Sep 2016 19:54:01 +0300 Subject: [PATCH 36/51] Add Knet.jl --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 1a53fca..3eb971c 100644 --- a/README.md +++ b/README.md @@ -488,6 +488,7 @@ For a list of free machine learning books available for download, go [here](http * [ROCAnalysis](https://github.com/davidavdav/ROCAnalysis.jl) - Receiver Operating Characteristics and functions for evaluation probabilistic binary classifiers * [GaussianMixtures] (https://github.com/davidavdav/GaussianMixtures.jl) - Large scale Gaussian Mixture Models * [ScikitLearn] (https://github.com/cstjean/ScikitLearn.jl) - Julia implementation of the scikit-learn API +* [Knet](https://github.com/denizyuret/Knet.jl) - Koç University Deep Learning Framework #### Natural Language Processing From 8c03257839ada328e7576f5eccf7f3aba9aa941a Mon Sep 17 00:00:00 2001 From: Rafael da Silva Ferreira Date: Wed, 21 Sep 2016 20:40:39 -0300 Subject: [PATCH 37/51] Add Swift Brain library The first neural network / machine learning library written in Swift. This is a project for AI algorithms in Swift for iOS and OS X development. This project includes algorithms focused on Bayes theorem, neural networks, SVMs, Matrices, etc.. --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 3eb971c..22e0ee1 100644 --- a/README.md +++ b/README.md @@ -1168,6 +1168,7 @@ on MNIST digits[DEEP LEARNING] It currently allows using deep convolutional neural network models trained in Caffe on Apple operating systems. 
* [AIToolbox](https://github.com/KevinCoble/AIToolbox) - A toolbox framework of AI modules written in Swift: Graphs/Trees, Linear Regression, Support Vector Machines, Neural Networks, PCA, KMeans, Genetic Algorithms, MDP, Mixture of Gaussians. * [MLKit](https://github.com/Somnibyte/MLKit) - A simple Machine Learning Framework written in Swift. Currently features Simple Linear Regression, Polynomial Regression, and Ridge Regression. +* [Swift Brain](https://github.com/vlall/Swift-Brain) - The first neural network / machine learning library written in Swift. This is a project for AI algorithms in Swift for iOS and OS X development. This project includes algorithms focused on Bayes theorem, neural networks, SVMs, Matrices, etc.. ## TensorFlow From 9a441b3de4719f5013cb975eaae88c56b19253ee Mon Sep 17 00:00:00 2001 From: Joseph Misiti Date: Fri, 23 Sep 2016 10:25:11 -0400 Subject: [PATCH 38/51] added Real World ML --- books.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/books.md b/books.md index 2c3360d..0130cb7 100644 --- a/books.md +++ b/books.md @@ -2,6 +2,8 @@ The following is a list of free, open source books on machine learning, statisti ## Machine-Learning / Data Mining +* [Real World Machine Learning](https://manning.com/books/real-world-machine-learning) [Free Chapters] +* [Real World Machine Learning](https://manning.com/books/real-world-machine-learning) [Free Chapters] * [An Introduction To Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) - Book + R Code * [Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) - Book * [Probabilistic Programming & Bayesian Methods for Hackers](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/) - Book + IPython Notebooks From 985629f7648f6737afe7433814d8d98fb99e1d16 Mon Sep 17 00:00:00 2001 From: Joseph Misiti Date: Fri, 23 Sep 2016 10:25:21 -0400 Subject: [PATCH 39/51] whoops --- books.md | 1 - 1 file changed, 1 deletion(-) diff --git 
a/books.md b/books.md index 0130cb7..915081b 100644 --- a/books.md +++ b/books.md @@ -2,7 +2,6 @@ The following is a list of free, open source books on machine learning, statisti ## Machine-Learning / Data Mining -* [Real World Machine Learning](https://manning.com/books/real-world-machine-learning) [Free Chapters] * [Real World Machine Learning](https://manning.com/books/real-world-machine-learning) [Free Chapters] * [An Introduction To Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) - Book + R Code * [Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) - Book From dcfaff7bc0e55c7c9a8964f39d5e952693f8173b Mon Sep 17 00:00:00 2001 From: fukatani Date: Sat, 24 Sep 2016 01:50:24 +0900 Subject: [PATCH 40/51] Add link to RGF. --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 22e0ee1..8428add 100644 --- a/README.md +++ b/README.md @@ -175,6 +175,7 @@ For a list of free machine learning books available for download, go [here](http * [DSSTNE](https://github.com/amznlabs/amazon-dsstne) - A software library created by Amazon for training and deploying deep neural networks using GPUs which emphasizes speed and scale over experimental flexibility. * [Intel(R) DAAL](https://github.com/01org/daal) - A high performance software library developed by Intel and optimized for Intel's architectures. Library provides algorithmic building blocks for all stages of data analytics and allows to process data in batch, online and distributed modes. * [MLDB](http://mldb.ai) - The Machine Learning Database is a database designed for machine learning. Send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs. +* [Regularized Greedy Forest](http://stat.rutgers.edu/home/tzhang/software/rgf/) - Regularized greedy forest (RGF) tree ensemble learning method. 
#### Natural Language Processing @@ -809,6 +810,7 @@ on MNIST digits[DEEP LEARNING] * [milk](https://github.com/luispedro/milk) - Machine learning toolkit focused on supervised classification. * [TFLearn](https://github.com/tflearn/tflearn) - Deep learning library featuring a higher-level API for TensorFlow. * [REP](https://github.com/yandex/rep) - an IPython-based environment for conducting data-driven research in a consistent and reproducible way. REP is not trying to substitute scikit-learn, but extends it and provides better user experience. +* [rgf_python](https://github.com/fukatani/rgf_python) - Python bindings for Regularized Greedy Forest (Tree) Library. #### Data Analysis / Data Visualization From 068da47ddb38da118388e98e3880f152fb353bba Mon Sep 17 00:00:00 2001 From: Lanre Adebambo Date: Mon, 26 Sep 2016 05:37:23 -0600 Subject: [PATCH 41/51] Add stanford-corenlp-python Include stanford-corenlp-python in the Python NLP tools --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 8428add..7719288 100644 --- a/README.md +++ b/README.md @@ -747,6 +747,7 @@ on MNIST digits[DEEP LEARNING] * [jellyfish](https://github.com/jamesturk/jellyfish) - a python library for doing approximate and phonetic matching of strings. 
* [editdistance](https://pypi.python.org/pypi/editdistance) - fast implementation of edit distance * [textacy](https://github.com/chartbeat-labs/textacy) - higher-level NLP built on Spacy +* [stanford-corenlp-python](https://github.com/dasmith/stanford-corenlp-python) - Python wrapper for [Stanford CoreNLP](https://github.com/stanfordnlp/CoreNLP) #### General-Purpose Machine Learning From 51ba5594289eb5db9d646140acc35aa3ec26e388 Mon Sep 17 00:00:00 2001 From: Joseph Misiti Date: Sun, 2 Oct 2016 15:00:45 -0400 Subject: [PATCH 42/51] added https://jeremykun.com/ --- blogs.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/blogs.md b/blogs.md index d188cc1..87996d4 100644 --- a/blogs.md +++ b/blogs.md @@ -25,6 +25,8 @@ Podcasts Data Science / Statistics ------------------------- +https://jeremykun.com/ + http://iamtrask.github.io/ http://blog.explainmydata.com/ From 4497c1c1f0f713b3c9a6c5314f9f6b376f10860a Mon Sep 17 00:00:00 2001 From: Arthur Douillard Date: Sat, 8 Oct 2016 15:37:26 +0100 Subject: [PATCH 43/51] books: neural-network: add new book http://neuralnetworksanddeeplearning.com is a great book on neural networks (unfortunately HTML-only). 
--- books.md | 1 + 1 file changed, 1 insertion(+) diff --git a/books.md b/books.md index 915081b..38ba772 100644 --- a/books.md +++ b/books.md @@ -51,6 +51,7 @@ and Prediction](http://users.soe.ucsc.edu/~draper/draper-BMIP-dec2005.pdf) ## Neural Networks * [A Brief Introduction to Neural Networks](http://www.dkriesel.com/_media/science/neuronalenetze-en-zeta2-2col-dkrieselcom.pdf) +* [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/) ## Probability & Statistics From c98532df91678c5f05f837433be303a89c97a213 Mon Sep 17 00:00:00 2001 From: Fabian Aussems Date: Sat, 15 Oct 2016 13:04:37 +0200 Subject: [PATCH 44/51] Added cortex --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 7719288..6aeeb00 100644 --- a/README.md +++ b/README.md @@ -231,6 +231,7 @@ For a list of free machine learning books available for download, go [here](http * [Statistiker](https://github.com/clojurewerkz/statistiker) - Basic Machine Learning algorithms in Clojure. * [clortex](https://github.com/nupic-community/clortex) - General Machine Learning library using Numenta’s Cortical Learning Algorithm * [comportex](https://github.com/nupic-community/comportex) - Functionally composable Machine Learning library using Numenta’s Cortical Learning Algorithm +* [cortex](https://github.com/thinktopic/cortex) - Neural networks, regression and feature learning in Clojure. #### Data Analysis / Data Visualization From 4e3ae0c0d34009784b5edf265af3f420d053b652 Mon Sep 17 00:00:00 2001 From: Traveloka Engineering Date: Sun, 16 Oct 2016 16:56:00 +0700 Subject: [PATCH 45/51] Add OpenAI Gym --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 6aeeb00..dbf4159 100644 --- a/README.md +++ b/README.md @@ -813,6 +813,7 @@ on MNIST digits[DEEP LEARNING] * [TFLearn](https://github.com/tflearn/tflearn) - Deep learning library featuring a higher-level API for TensorFlow. 
* [REP](https://github.com/yandex/rep) - an IPython-based environment for conducting data-driven research in a consistent and reproducible way. REP is not trying to substitute scikit-learn, but extends it and provides better user experience. * [rgf_python](https://github.com/fukatani/rgf_python) - Python bindings for Regularized Greedy Forest (Tree) Library. +* [gym](https://github.com/openai/gym) - OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. #### Data Analysis / Data Visualization From 6d35d871bb0f00c720b90919d80895dfb2c9e7f3 Mon Sep 17 00:00:00 2001 From: gdouzas Date: Mon, 17 Oct 2016 19:26:51 +0300 Subject: [PATCH 46/51] Added shark description --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index dbf4159..f52ed70 100644 --- a/README.md +++ b/README.md @@ -155,7 +155,7 @@ For a list of free machine learning books available for download, go [here](http * [mlpack](http://www.mlpack.org/) - A scalable C++ machine learning library * [DLib](http://dlib.net/ml.html) - A suite of ML tools designed to be easy to imbed in other applications * [encog-cpp](https://code.google.com/p/encog-cpp/) -* [shark](http://image.diku.dk/shark/sphinx_pages/build/html/index.html) +* [shark](http://image.diku.dk/shark/sphinx_pages/build/html/index.html) - A fast, modular, feature-rich open-source C++ machine learning library. * [Vowpal Wabbit (VW)](https://github.com/JohnLangford/vowpal_wabbit/wiki) - A fast out-of-core learning system. * [sofia-ml](https://code.google.com/p/sofia-ml/) - Suite of fast incremental algorithms. * [Shogun](https://github.com/shogun-toolbox/shogun) - The Shogun Machine Learning Toolbox @@ -836,7 +836,7 @@ on MNIST digits[DEEP LEARNING] * [plotly](https://plot.ly/python/) - Collaborative web plotting for Python and matplotlib. * [vincent](https://github.com/wrobstory/vincent) - A Python to Vega translator. 
* [d3py](https://github.com/mikedewar/d3py) - A plotting library for Python, based on [D3.js](http://d3js.org/). -* [PyDexter](https://github.com/D3xterjs/pydexter) - Simple plotting for Python. Wrapper for D3xterjs; easily render charts in-browser. +* [PyDexter](https://github.com/D3xterjs/pydexter) - Simple plotting for Python. Wrapper for D3xterjs; easily render charts in-browser. * [ggplot](https://github.com/yhat/ggplot) - Same API as ggplot2 for R. * [ggfortify](https://github.com/sinhrks/ggfortify) - Unified interface to ggplot2 popular R packages. * [Kartograph.py](https://github.com/kartograph/kartograph.py) - Rendering beautiful SVG maps in Python. From 3f6b0adacc7e20e23465d5d04cba3cfd1de2b8d5 Mon Sep 17 00:00:00 2001 From: gdouzas Date: Mon, 17 Oct 2016 19:28:37 +0300 Subject: [PATCH 47/51] ROOT added --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index f52ed70..f86e366 100644 --- a/README.md +++ b/README.md @@ -152,6 +152,7 @@ For a list of free machine learning books available for download, go [here](http #### General-Purpose Machine Learning +* [ROOT](https://root.cern.ch) - A modular scientific software framework. It provides all the functionalities needed to deal with big data processing, statistical analysis, visualization and storage. 
* [mlpack](http://www.mlpack.org/) - A scalable C++ machine learning library * [DLib](http://dlib.net/ml.html) - A suite of ML tools designed to be easy to imbed in other applications * [encog-cpp](https://code.google.com/p/encog-cpp/) From b3ca9d4d4cb2481d3fd27ce7f891738b08d8102f Mon Sep 17 00:00:00 2001 From: Andrew Huy Nguyen Date: Mon, 17 Oct 2016 15:05:33 -0600 Subject: [PATCH 48/51] Added visualize_ML --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index f86e366..3344da2 100644 --- a/README.md +++ b/README.md @@ -860,6 +860,7 @@ on MNIST digits[DEEP LEARNING] * [Ruffus](http://www.ruffus.org.uk) - Computation Pipeline library for python. * [SOMPY](https://github.com/sevamoo/SOMPY) - Self Organizing Map written in Python (Uses neural networks for data analysis). * [HDBScan](https://github.com/lmcinnes/hdbscan) - implementation of the hdbscan algorithm in Python - used for clustering +* [visualize_ML](https://github.com/ayush1997/visualize_ML) - A python package for data exploration and data analysis. #### Misc Scripts / iPython Notebooks / Codebases From cbaa1aed183a5d1188f496b1b43313482a80065e Mon Sep 17 00:00:00 2001 From: Andrew Huy Nguyen Date: Wed, 19 Oct 2016 13:29:43 -0600 Subject: [PATCH 49/51] Added IDEAR and AMR --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 3344da2..ec6dca2 100644 --- a/README.md +++ b/README.md @@ -1075,6 +1075,7 @@ on MNIST digits[DEEP LEARNING] * [Optunity](http://docs.optunity.net) - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly to R. 
 * [igraph](http://igraph.org/r/) - binding to igraph library - General purpose graph library
 * [MXNet](https://github.com/dmlc/mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, Javascript and more.
+* [TDSP-Utilities](https://github.com/Azure/Azure-TDSP-Utilities) - Two data science utilities in R from Microsoft: 1) Interactive Data Exploration, Analysis, and Reporting (IDEAR) ; 2) Automated Modeling and Reporting (AMR).
 #### Data Analysis / Data Visualization

From a983dcc1b3cc79b1150fb58ef4319a8a0b9e91b2 Mon Sep 17 00:00:00 2001
From: Arkadiusz Kondas
Date: Wed, 26 Oct 2016 19:35:37 +0200
Subject: [PATCH 50/51] Add link to php-ml

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index ec6dca2..434dbc5 100644
--- a/README.md
+++ b/README.md
@@ -706,6 +706,7 @@ on MNIST digits[DEEP LEARNING]
 ### General-Purpose Machine Learning
+* [PHP-ML](https://github.com/php-ai/php-ml) - Machine Learning library for PHP. Algorithms, Cross Validation, Neural Network, Preprocessing, Feature Extraction and much more in one library.
 * [PredictionBuilder](https://github.com/denissimon/prediction-builder) - A library for machine learning that builds predictions using a linear regression.
From 02128d0709f99427506e34b0913b63a337174f55 Mon Sep 17 00:00:00 2001
From: cloudkj
Date: Wed, 26 Oct 2016 12:00:18 -0700
Subject: [PATCH 51/51] Add lambda-ml

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index ec6dca2..d06f373 100644
--- a/README.md
+++ b/README.md
@@ -233,6 +233,7 @@ For a list of free machine learning books available for download, go [here](http
 * [clortex](https://github.com/nupic-community/clortex) - General Machine Learning library using Numenta’s Cortical Learning Algorithm
 * [comportex](https://github.com/nupic-community/comportex) - Functionally composable Machine Learning library using Numenta’s Cortical Learning Algorithm
 * [cortex](https://github.com/thinktopic/cortex) - Neural networks, regression and feature learning in Clojure.
+* [lambda-ml](https://github.com/cloudkj/lambda-ml) - Simple, concise implementations of machine learning techniques and utilities in Clojure.
 #### Data Analysis / Data Visualization