Useful free datasets (Part 1)

Here is a list of datasets that are free to download.


American Economic Association (AEA):
World Bank:

Data Science Practice

This section contains datasets used in the book “Doing Data Science” by Rachel Schutt and Cathy O’Neil (O’Reilly, 2014).
Datasets on the book site:
Enron Email Dataset:
GetGlue (time stamped events: users rating TV shows):
Titanic Survival Data Set:
Half a million Hubway rides:


CBOE Futures Exchange:
Google Finance: (R)
Google Trends:
St Louis Fed: (R)
Yahoo Finance: (R)

To view more go to:



LikeAlyzer: Free Social Media Tool for Facebook

LikeAlyzer is an online tool for companies that want to be successful on Facebook. It helps you measure and analyze the potential and effectiveness of your Facebook Pages.


  • It provides daily updated Facebook statistics for your company or other Pages of interest.
  • It enables you to monitor and compare your efforts with those of the world’s popular brands or relevant companies, such as competitors.


Graph Visualization with Gephi

Gephi is an interactive visualization and exploration solution that supports dynamic and hierarchical graphs. It runs on Windows, Linux and Mac OS X. Gephi is open-source and free.


The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing. It complements traditional statistics, since visual thinking with interactive interfaces is now recognized to facilitate reasoning.

  • Real time visualization
  • Layout algorithms (force-based and multi-level)
  • Metrics (Betweenness, Closeness, Diameter, Clustering Coefficient, Average shortest path, PageRank, HITS, Community Detection, Random Generators)
  • Dynamic Network Analysis
  • Create Cartography
  • Clustering and hierarchical graphs
  • Dynamic Filtering
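
To make a few of the metrics in the list above concrete, here is a minimal sketch in plain Python (it does not use Gephi or any Gephi API). It computes closeness, diameter, average shortest path, and the local clustering coefficient on a small undirected graph. The graph `EXAMPLE_GRAPH` and all function names are hypothetical, chosen only for illustration, and the formulas are the common textbook definitions, which may differ from Gephi's own normalization.

```python
from collections import deque

# Hypothetical toy graph (undirected), stored as node -> set of neighbors.
EXAMPLE_GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def bfs_distances(graph, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(graph, node):
    """Closeness as (reachable nodes) / (sum of distances to them)."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


def diameter_and_average_path(graph):
    """Longest and mean shortest-path length over all reachable pairs."""
    longest, total, pairs = 0, 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, (total / pairs if pairs else 0.0)


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2 * links / (k * (k - 1))


if __name__ == "__main__":
    for name in sorted(EXAMPLE_GRAPH):
        print(name, closeness(EXAMPLE_GRAPH, name),
              clustering_coefficient(EXAMPLE_GRAPH, name))
    print(diameter_and_average_path(EXAMPLE_GRAPH))
```

In this toy graph, node "c" touches every other node directly, so it has the highest closeness, while "d" hangs off a single edge and has a clustering coefficient of zero. A tool like Gephi computes the same kind of quantities at scale and maps them onto node size or color.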



To learn more about it, go to


How artificial intelligence is transforming the financial industry

By Michelle Fleury, BBC business correspondent, New York


Your next stockbroker might just be a computer.

More and more, financial firms are turning to machines to do the job humans have done for decades.

Last spring, wealth management firm Charles Schwab launched a new service called Schwab Intelligent Portfolios. The service is unique in that it’s not a person who decides where to invest your money, it’s an algorithm – lines of code programmed into a computer.

“It’s lower cost for the investor,” says Tobin McDaniel, who leads the Schwab Intelligent Portfolios team.

“As opposed to working with a traditional advisor where you might pay up to 1%, here you get portfolio management at essentially no management fee.”



To learn more: How artificial intelligence is transforming the financial industry

The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data

Reading for this week: The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data (Feldman & Sanger, 2006)




Text mining tries to solve the crisis of information overload by combining techniques from data mining, machine learning, natural language processing, information retrieval, and knowledge management. In addition to providing an in-depth examination of core text mining and link detection algorithms and operations, this book examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches. Finally, it explores current real-world, mission-critical applications of text mining and link detection in such varied fields as M&A business intelligence, genomics research and counter-terrorism activities.

Sentiment Analysis. Mining Opinions, Sentiments, and Emotions

Recommended book of the week: Sentiment Analysis: Mining Opinions, Sentiments, and Emotions (B. Liu, 2015)




Sentiment analysis is the computational study of people’s opinions, sentiments, emotions, and attitudes. This fascinating problem is increasingly important in business and society. It offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis. This book gives a comprehensive introduction to the topic from a primarily natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs that are commonly used to express opinions and sentiments. It covers all core areas of sentiment analysis, includes many emerging themes, such as debate analysis, intention mining, and fake-opinion detection, and presents computational methods to analyze and summarize opinions. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.
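
As a rough illustration of what "computational study of opinions" can mean in practice, here is a minimal lexicon-based scorer in plain Python. It is a sketch of one simple approach, not a method taken from the book: the word lists, the negation rule, and the function name `sentiment_score` are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny word lists; real lexicons are far larger.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum word polarities; a negator flips only the very next word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        if token in POSITIVE:
            polarity = 1
        elif token in NEGATIVE:
            polarity = -1
        else:
            polarity = 0
        score += -polarity if negate else polarity
        negate = False  # negation does not carry past one word
    return score


if __name__ == "__main__":
    for sentence in ("I love this great phone", "This is not good"):
        print(sentence, "->", sentiment_score(sentence))
```

The one-word negation window is a known weakness: in "not very good" the word "very" resets the flag, so the phrase scores as positive. Handling cases like this is part of why the problem needs the language-level analysis the blurb describes.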


XIII Congreso Internacional en Innovación Tecnológica Informática

CIITI 2015

What is CIITI?

CIITI creates a space for open, participatory, and inclusive reflection on the impact of information technology across the different fields of science. It presents innovations and new knowledge to society, and serves as a forum for disseminating, promoting, and reflecting on the importance of innovation in information technology as a driver of competitiveness.

The Congress fosters a virtuous circle of coordination among government, companies, universities, and national and international research and development centers. From this interdisciplinary perspective, it addresses the processes through which new technologies bring about social change and, year after year, renews the importance of treating the strategic development of new technologies as a fundamental pillar of the country's equitable and sustainable growth.

Buenos Aires Chapter
September 30, 2015
Palais Rouge
Jerónimo Salguero 1443/49 – Ciudad Autónoma de Buenos Aires

See the CIITI 2015 agenda, Buenos Aires