Spelling correction is not a trivial task for a computer, and increasingly sophisticated models are being developed to tackle it. Language models, which assign probabilities to words or word sequences, are commonly used for this task. They are also used for correcting errors in speech recognition and machine translation, for language and authorship identification, for text compression, and for topic relevance ranking. In this article, language models are used for a simple spelling correction application.
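To give a feel for the idea, here is a minimal sketch of a spelling corrector built on a unigram language model. It is not the implementation from the article itself: the toy `CORPUS`, the function names (`tokenize`, `train_unigram`, `edits1`, `correct`) and the choice to look only at candidates one edit away are all assumptions made for illustration. A real application would train on a much larger text.

```python
import re
from collections import Counter

# Toy corpus, assumed for illustration only.
CORPUS = "the quick brown fox jumps over the lazy dog the dog barks"


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train_unigram(tokens):
    """Estimate P(word) as the relative frequency of each word."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def edits1(word):
    """Return all strings that are one edit away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, model):
    """Pick the most probable known word within one edit of `word`."""
    word = word.lower()
    if word in model:
        return word  # already a known word
    candidates = [w for w in edits1(word) if w in model]
    if not candidates:
        return word  # nothing better known, leave the word unchanged
    return max(candidates, key=model.get)


MODEL = train_unigram(tokenize(CORPUS))
```

With this toy model, `correct("teh", MODEL)` yields `"the"`, while a word with no known neighbour is returned unchanged. A unigram model ignores context, so it cannot choose between two valid words; that is where higher-order language models become useful.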
There are lots of Python packages available on the internet. The aim of this post is to give you an overview of scientifically oriented Python packages, sorted by topic. The list will be updated regularly. If you have any recommendations, feel free to add them in the comments!
- NumPy – N-dimensional arrays and fast numerical computing.
- pandas – Data structures and data analysis.
- matplotlib – Plotting and visualization tools.
- SymPy – For working with symbolic mathematics.
- Numba – Just-in-time compiler that speeds up numerical Python code.
- emcee – Markov chain Monte Carlo sampling.
- scikit-image – Image processing toolkit.
- IPython – An interactive shell.
- Anaconda – A bundle of the most used Python libraries.
- SciPy – A collection of scientific computing routines (optimization, integration, linear algebra, statistics).
The series “Data Mining on Medical Data” highlights several data mining techniques. The series is written in collaboration with John Snow Labs, which provided me with the medical datasets. In this article, basic text mining techniques are highlighted and some of the results are presented.