How to scrape a website using Python + Scrapy in 5 simple steps

In this Python Scrapy tutorial, you will learn how to write a simple web scraper in Python using the Scrapy framework. The Data Blogger website is used as the example throughout this article.

Scrapy: an open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

By the way, if you are interested in scraping Tweets, you should definitely read this article.


Monitoring your cluster in just a few minutes using ISA

Figure: CPU Sum, an example visualization of the data produced by ISA.

Suppose you have a cluster and would like to start monitoring it as soon as possible, without installing all kinds of tools on the cluster itself. A new software package named ISA has been created which can do centralized monitoring for you! This article is a walkthrough for ISA and helps you set up monitoring for your cluster in just a few minutes.

Features

  • ISA can collect many node statistics such as CPU usage, memory usage and disk I/O.
  • It is easy to set up and has flexible node configuration.
  • ISA keeps its own influence on the node statistics minimal.
  • No setup is required on the nodes; statistics management is done centrally.

In this tutorial, we will set up ISA and collect cluster statistics in a CSV file.
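
Once the statistics are in a CSV file, a figure such as the CPU sum can be derived with a few lines of Python. The sketch below is only an illustration: the column names `timestamp`, `node` and `cpu_percent` are assumptions, since the actual layout of the file ISA produces is not shown here, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def cpu_sum_per_timestamp(csv_file):
    """Sum the CPU usage of all nodes for each timestamp.

    Assumes (hypothetically) one row per node per sample, with the
    columns 'timestamp', 'node' and 'cpu_percent'.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["timestamp"]] += float(row["cpu_percent"])
    return dict(totals)


# Made-up sample data standing in for a real statistics file.
sample = io.StringIO(
    "timestamp,node,cpu_percent\n"
    "0,node1,12.5\n"
    "0,node2,30.0\n"
    "1,node1,20.0\n"
    "1,node2,25.0\n"
)
print(cpu_sum_per_timestamp(sample))  # {'0': 42.5, '1': 45.0}
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open("stats.csv", newline="")`, where `stats.csv` is again a placeholder name.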
