
Gathering Tweets with Python

This tutorial guides you through setting up a system for collecting Tweets, not in Apache Spark or Apache Flink, but in plain Python. In many use cases, a single computing node can collect enough Tweets to draw decent conclusions. In future blog posts, I will explain how to collect Tweets using a cluster (with either Apache Spark or Apache Flink). But for now, let's focus on a simple Pythonic harvester!
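To give a rough idea of what a single-node harvester does, here is a minimal sketch using only the standard library. It assumes the Tweets arrive as one JSON object per line from some stream; the `harvest` function, the `id` and `text` field names, and the `tweets.jsonl` file name are illustrative assumptions, not the code from the tutorial itself.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def harvest(raw_lines: Iterable[str], out_path: Path) -> int:
    """Append unique Tweets from raw_lines to out_path and return how many were stored.

    Assumption: each line is a JSON object with an "id" and a "text" field.
    """
    seen = set()
    stored = 0
    with out_path.open("a", encoding="utf-8") as out:
        for line in raw_lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines (for example keep-alive messages)
            try:
                tweet = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON
            if not isinstance(tweet, dict):
                continue  # skip JSON values that are not objects
            tweet_id = tweet.get("id")
            if tweet_id is None or tweet_id in seen:
                continue  # skip messages without an id and duplicates
            seen.add(tweet_id)
            record = {"id": tweet_id, "text": tweet.get("text", "")}
            out.write(json.dumps(record) + "\n")
            stored += 1
    return stored


# Hypothetical sample stream standing in for a real Twitter connection.
sample_stream = [
    '{"id": 1, "text": "hello"}',
    "",
    "not json",
    '{"id": 1, "text": "hello"}',
    '{"id": 2, "text": "bye"}',
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "tweets.jsonl"
    stored = harvest(sample_stream, target)
    saved_lines = target.read_text(encoding="utf-8").splitlines()

print(stored, "Tweets stored")
```

Writing one JSON object per line keeps the harvester append-only, so it can be stopped and restarted without rewriting the file. Note that the duplicate check in this sketch only covers a single run.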



Stock price following an upward trend.

Getting Rich using Bitcoin stock prices and Twitter!

How can we use machine learning to predict stock prices? In this tutorial we will write Python scripts that perform sentiment analysis on Tweets, and I will explain how to use the results for making predictions.

As an example, suppose we had €1000,- on the first of January 2014 and could use the algorithm described in this tutorial. Then we would have had €2901,- in total on the 22nd of February 2017! The total amount of money (cash + investments) is shown in the next figure:


Despite the patience you need to have, it will be worth the wait eventually. As mentioned in [1], moods in Tweets are a good indication of the movement of closing prices on a stock market. In this article, we will only predict how positive or how negative a Tweet is. It turns out that this gives predictive signals that are accurate enough for our purposes.
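To make "how positive or how negative a Tweet is" concrete, here is a minimal sketch of a polarity score. It uses a tiny hand-made word list rather than the machine-learning model the tutorial builds, so treat it as a stand-in: the `POSITIVE` and `NEGATIVE` sets and the `polarity` and `daily_signal` functions are assumptions made for illustration only.

```python
import re

# Hypothetical word lists; a real model would learn these weights from data.
POSITIVE = {"good", "great", "gain", "up", "bullish", "profit", "rise"}
NEGATIVE = {"bad", "loss", "down", "bearish", "crash", "drop", "fear"}


def polarity(tweet: str) -> float:
    """Return a score in [-1, 1]: -1 is fully negative, 1 is fully positive."""
    words = re.findall(r"[a-z']+", tweet.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    hits = pos + neg
    # A Tweet without any known sentiment word counts as neutral.
    return 0.0 if hits == 0 else (pos - neg) / hits


def daily_signal(tweets) -> float:
    """Average the polarity of all Tweets of one day into a single signal."""
    scores = [polarity(tweet) for tweet in tweets]
    return sum(scores) / len(scores) if scores else 0.0


tweets_today = [
    "Bitcoin is up, great profit!",
    "Crash and fear, but a good entry",
    "Nothing special today",
]
print(daily_signal(tweets_today))
```

Averaging the scores per day gives one number that can be compared with that day's closing price, which is the kind of signal the paragraph above refers to.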



Top Data Science Blogs

Hooray! The Data Blogger blog is listed in this top 75 of Data Science blogs. This is a good moment to give an overview of some of the most influential blogs for Data Science.

Data Science Central

Data Science Central has multiple authors. Besides blog posts, they also provide video material and job postings. In my opinion, they mainly focus on practical material and discussion, and less on theoretical posts.

DataTau

Just like Data Science Central, DataTau collects blog posts from multiple authors. It provides an RSS feed of the most influential blog posts. Theoretical posts are scarce here as well, but you can find many practical posts and links to sources.


One of my personal favorites! This blog does provide sufficient theoretical posts and the topics are quite diverse here. This is one of my inspirational sources.

That is it for now. For more interesting Data Science blogs, you should definitely take a look at Feedspot. I will be back in a few weeks and start writing posts again. For now, please fill in the poll in the menu, since the next post will be based on it.
