mirror of
https://github.com/josephmisiti/awesome-machine-learning.git
synced 2024-12-27 11:42:09 +03:00
A curated list of awesome Machine Learning frameworks, libraries and software.
A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php.
If you want to contribute to this list, send me a pull request or contact me @josephmisiti.
Python
Natural Language Processing
- NLTK - A leading platform for building Python programs to work with human language data.
- Pattern - A web mining module for the Python programming language. It has tools for natural language processing and machine learning, among others.
- TextBlob - Provides a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
- jieba - Chinese Word Segmentation Utilities.
- SnowNLP - A library for processing Chinese text.
- loso - Another Chinese segmentation library.
- genius - A Chinese segmentation library based on Conditional Random Fields.
General-Purpose Machine Learning
- scikit-learn - A Python module for machine learning built on top of SciPy.
- pattern - Web mining module for Python.
- NuPIC - Numenta Platform for Intelligent Computing.
- Pylearn2 - A Machine Learning library based on Theano.
- hebel - GPU-Accelerated Deep Learning Library in Python.
- gensim - Topic Modelling for Humans.
- PyBrain - Another Python Machine Learning Library.
- Crab - A flexible, fast recommender engine.
- python-recsys - A Python library for implementing a Recommender System.
- BayesPy
Data Analysis / Data Visualization
- SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
- NumPy - A fundamental package for scientific computing with Python.
- Numba - Python JIT (just-in-time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.
- NetworkX - High-productivity software for complex networks.
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
- Open Mining - Business Intelligence (BI) in Python (Pandas web interface)
- PyMC - Markov Chain Monte Carlo sampling toolkit.
- zipline - A Pythonic algorithmic trading library.
- PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.
- SymPy - A Python library for symbolic mathematics.
- statsmodels - Statistical modeling and econometrics in Python.
- astropy - A community Python library for Astronomy.
- matplotlib - A Python 2D plotting library.
- bokeh - Interactive Web Plotting for Python.
- plotly - Collaborative web plotting for Python and matplotlib.
- vincent - A Python to Vega translator.
- d3py - A plotting library for Python, based on D3.js.
- ggplot - Same API as ggplot2 for R.
- Kartograph.py - Rendering beautiful SVG maps in Python.
- pygal - A Python SVG Charts Creator.
- pycascading
Misc Scripts / iPython Notebooks
- pattern_classification
- thinking stats 2
- hyperopt
- nupic
- 2012-paper-diginorm
- ipython-notebooks
- decision-weights
Ruby
Natural Language Processing
- Treat - Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby
- Ruby Linguistics - NLTK for Ruby
- Stemmer
- Ruby Wordnet
- Raspell
- UEA Stemmer
- Twitter-text-rb
General-Purpose Machine Learning
Data Analysis / Data Visualization
- rsruby
- data-visualization-ruby
- ruby-plot
- plot-rb
- scruffy
- SciRuby
- Glean - A data management tool for humans
- Bioruby
- Arel
Misc
R
General-Purpose Machine Learning
Javascript
Natural Language Processing
Data Analysis / Data Visualization
General-Purpose Machine Learning
- Convnet.js [DEEP LEARNING]
- Clustering.js
- Decision Trees
- Node-fann
- Kmeans.js
- LDA.js
- Learning.js
- Machine Learning
- Node-SVM
- Brain
Scala
Natural Language Processing
- ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
- Breeze - Breeze is a numerical processing library for Scala.
- Chalk - Chalk is a natural language processing library.
- FACTORIE - FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
Data Analysis / Data Visualization
General-Purpose Machine Learning
Java
Natural Language Processing
- [CoreNLP](http://nlp.stanford.edu/software/corenlp.shtml)
- [Stanford Parser](http://nlp.stanford.edu/software/lex-parser.shtml)
- [Stanford POS Tagger](http://nlp.stanford.edu/software/tagger.shtml)
- [Stanford Named Entity Recognizer](http://nlp.stanford.edu/software/CRF-NER.shtml)
- [Stanford Word Segmenter](http://nlp.stanford.edu/software/segmenter.shtml)
- Tregex, Tsurgeon and Semgrex
- Stanford Phrasal: A Phrase-Based Translation System
- Stanford English Tokenizer
- Stanford Tokens Regex
- Stanford Temporal Tagger
- Stanford SPIED
- Stanford Topic Modeling Toolbox
- Twitter Text Java
General-Purpose Machine Learning
Data Analysis / Data Visualization
Go
Natural Language Processing
General-Purpose Machine Learning
Data Analysis / Data Visualization
Matlab
Natural Language Processing
General-Purpose Machine Learning
- Training a deep autoencoder or a classifier on MNIST digits [DEEP LEARNING]
- t-Distributed Stochastic Neighbor Embedding
- Spider
- LibSVM
- LibLinear
Data Analysis / Data Visualization
Julia
General-Purpose Machine Learning
- PGM
- DA
- Regression
- Local Regression
- Naive Bayes
- Mixed Models
- Simple MCMC
- Distance
- Decision Tree
- Neural
- MCMC
- GLM
- Online Learning
- GLMNet
- Clustering
- SVM
- Kernel Density
- Dimensionality Reduction
- NMF
Natural Language Processing
Data Analysis / Data Visualization
- Graph Layout
- Data Frames Meta
- Julia Data
- Data Read
- Hypothesis Tests
- Gadfly
- Stats
- RDataSets
- DataFrames
- Distributions
- Data Arrays
- Time Series
- Sampling