This tutorial guides you through setting up a system for collecting Tweets, not in Apache Spark or Apache Flink, but in plain Python. In many use cases, a single computing node can collect enough Tweets to draw decent conclusions. In future blog posts, I will explain how to collect Tweets using a cluster (with either Apache Spark or Apache Flink). But for now, let's focus on a simple Pythonic harvester!
How can we use machine learning to predict stock prices? In this tutorial we will write Python scripts that perform sentiment analysis on Tweets, and I will explain how to use the results to make predictions.
As an example, suppose we had €1000,- on the first of January 2014 and could use the algorithm described in this tutorial. Then it would have grown to €2901,- in total by the 22nd of February 2017! The total amount of money (cash + investments) is shown in the next figure:
Despite the patience you need to have, it will be worth the wait eventually. As mentioned in , moods in tweets are a good indication of the movement of closing prices on a stock market. In this article, we will only predict how positive or how negative a tweet is. But it turns out that this gives predictive signals that are accurate enough for our purposes.
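To make "how positive or how negative a tweet is" a bit more concrete, here is a minimal sketch of a word-list approach. It is not necessarily the method used in the article: the `LEXICON` dictionary, its weights, and the function names `tweet_polarity` and `daily_sentiment` are all hypothetical and only there for illustration.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight (illustrative values only).
LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "gain": 1.0,
    "bad": -1.0,
    "terrible": -2.0,
    "loss": -1.0,
}


def tweet_polarity(text, lexicon=LEXICON):
    """Return the mean weight of the lexicon words found in a tweet.

    Tweets without any known word get a neutral score of 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def daily_sentiment(tweets, lexicon=LEXICON):
    """Average the polarity over a collection of tweets (0.0 if empty)."""
    if not tweets:
        return 0.0
    return sum(tweet_polarity(tweet, lexicon) for tweet in tweets) / len(tweets)
```

A positive score suggests an optimistic mood and a negative score a pessimistic one. A real system would probably swap the toy lexicon for a trained model or a much larger word list.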
This post gives a minimal working example so that you can launch your Django application on Amazon servers using Elastic Beanstalk. The only things you need are a Django application, Python 3 and an Amazon account. Before we start, make sure you have installed the Amazon CLI. Let’s start!
There are lots of Python packages available on the internet. The aim of this post is to give you an overview of scientifically oriented Python packages, sorted per topic. The list will be updated regularly. If you have any recommendations, feel free to add them in the comments!
- NumPy – Powerful computational framework.
- pandas – Data structures and data analysis.
- matplotlib – Plotting and visualization tools.
- SymPy – For working with symbolic mathematics.
- Numba – Just-in-time compiler for high-performance numerical code.
- emcee – Markov chain Monte Carlo sampling.
- scikit-image – Image processing toolkit.
- IPython – An interactive shell.
- Anaconda – A bundle of the most used Python libraries.
- SciPy – A bundle of the most used scientific Python libraries.
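
If you want to see which of the packages above are already available in your environment, a small sketch like the following may help. It only uses the standard library. The `PACKAGES` mapping and the `installed_packages` function are names I made up for this example. Anaconda is left out because it is a distribution rather than an importable module.

```python
from importlib.util import find_spec

# Display name -> import name. Note that scikit-image is imported as "skimage".
PACKAGES = {
    "NumPy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "SymPy": "sympy",
    "Numba": "numba",
    "emcee": "emcee",
    "scikit-image": "skimage",
    "IPython": "IPython",
    "SciPy": "scipy",
}


def installed_packages(packages=PACKAGES):
    """Map each display name to True if its module can be located, else False."""
    return {
        name: find_spec(module) is not None
        for name, module in packages.items()
    }


if __name__ == "__main__":
    # Print a simple two-column overview of what is installed.
    for name, available in installed_packages().items():
        print(f"{name:<14}{'installed' if available else 'missing'}")
```

The check only looks the module up and does not import it, so it stays fast even for heavy packages.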