This tutorial guides you through setting up a system for collecting Tweets, not with Apache Spark or Apache Flink, but with plain Python and Tweepy. In many use cases, a single computing node can collect enough Tweets to draw decent conclusions. In future blog posts, I will explain how to collect Tweets using a cluster (with either Apache Spark or Apache Flink). But for now, let's focus on a simple Pythonic harvester! If you are interested in scraping a website, you should definitely read this article.
In this course, you will learn how to use the Python library Pandas. After the course, you will be able to:
- Load and transform your data
- Visualize data using line plots, scatter plots, and histograms
- Merge and store data
The course also includes more advanced topics, such as data parallelization and aggregation.
“Data Mining with Python on Medical Datasets for Data Mining” is a series in which several data mining techniques are highlighted. The series is written in collaboration with John Snow Labs, which provided the medical datasets. This article highlights basic text mining techniques and presents some of the results.
By the way, if you are interested in Deep Learning, you should definitely read this article on implementing a GRU in Python using TensorFlow.