In this era of Big Data, one issue that has been escalating of late is data fragmentation across organisations, which makes analytics and reporting even more complex. This is where data pipeline tools come into play. A data pipeline is a set of actions carried out to extract data from different sources. For a startup, building a data pipeline is an important aspect of data science: it needs to gather data points from all its users and process them in real time in order to develop data products.
In this blog post, I will show you how to mine opinions about companies from news articles. I will share how I scraped thousands of news articles in a few minutes and how one could classify the opinion expressed in their titles. This information could be used, for example, to keep an eye on a company's competitors or to predict global trends.
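To give a rough idea of what classifying the opinion in a title can look like, here is a minimal sketch. It is an assumption for illustration only, not necessarily the method used later in this post: the word lists `POSITIVE` and `NEGATIVE`, the function `classify_title`, and the example titles are all hypothetical, and a real classifier would more likely rely on a trained model or a much larger lexicon.

```python
import re

# Hypothetical, hand-picked word lists used only for illustration.
POSITIVE = {"beats", "record", "profit", "growth", "wins", "surges"}
NEGATIVE = {"lawsuit", "recall", "loss", "scandal", "layoffs", "falls"}


def classify_title(title: str) -> str:
    """Label a news title as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase the title and split it into simple word tokens.
    words = re.findall(r"[a-z']+", title.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example titles about a fictional company.
example_titles = [
    "Acme beats expectations with record profit",
    "Acme faces lawsuit after product recall",
    "Acme opens new office",
]
for title in example_titles:
    print(f"{classify_title(title):>8}  {title}")
```

Counting words like this ignores negation and context ("no loss" would still count as negative), so it should be read as a baseline to compare better approaches against.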
Recently I came across the concept of CTFs (Capture The Flag), which are security competitions, and I really enjoy these kinds of competitions. One of my favorites is OverTheWire. It provides different kinds of games, and one of these is Natas, in which you need to attack a webserver. The concept is simple: the creator of the game designs a normally functioning webserver and intentionally puts in a bug. To get to the next level we need to find a password, which can be obtained by exploiting the bug. That password is the flag, which is why it is called Capture The Flag. So let’s get started!