Fictional narratives can be a useful subject of analysis if we think in terms of the connections among characters. If we treat the characters as nodes and the relationships among them as edges, we can study how the network is structured, how dense it is, and other features.
Learn more at: Shakespearean Tragedy
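As a rough illustration of the idea, the sketch below builds a small character network and computes its density (the number of actual edges divided by the number of possible edges). The scene list is made up for the example and is not taken from any real dataset; the helper names `build_edges` and `density` are likewise hypothetical.

```python
from itertools import combinations

# Hypothetical co-appearance data: each set lists characters sharing a scene.
# These groupings are illustrative only, not an actual breakdown of the play.
scenes = [
    {"Hamlet", "Horatio"},
    {"Hamlet", "Gertrude", "Claudius"},
    {"Claudius", "Laertes"},
    {"Hamlet", "Ophelia"},
]


def build_edges(scenes):
    """Return the set of undirected edges between characters sharing a scene."""
    edges = set()
    for scene in scenes:
        # Sorting makes (a, b) and (b, a) collapse into one edge.
        for a, b in combinations(sorted(scene), 2):
            edges.add((a, b))
    return edges


def density(num_nodes, num_edges):
    """Density of an undirected graph: actual edges / possible edges."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


nodes = set().union(*scenes)
edges = build_edges(scenes)
print(f"characters: {len(nodes)}, relationships: {len(edges)}")
print(f"density: {density(len(nodes), len(edges)):.2f}")
```

A density close to 1 would mean almost every character interacts with every other one; a low value suggests a network organized around a few central characters.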
Other sites also offer a great variety of free datasets:
Yahoo! Labs – Collection of datasets related to language, social, marketing and more. They’re well organized and most of them are hundreds of megabytes in size.
Awesome Public Datasets – This is a GitHub repository that’s a list of publicly available datasets organized by category.
Gapminder – Hundreds of datasets on world health, economics, population, etc. All of it is viewable online within Google Docs, and downloadable as spreadsheets.
The Info – Mostly large datasets. The site is losing momentum, but the data available here is still gold.
The Data Hub – Hosted by CKAN. Most of these datasets come from the government.
Datamob – List of public datasets.
Numbrary – Lists of datasets.
Kaggle – Kaggle is a site that hosts data mining competitions. Each competition provides a data set that’s free for download.
SNAP – Stanford’s Large Network Dataset Collection. This list has several datasets related to social networking. Lots of fun in here!
More available datasets at: https://r-dir.com/reference/datasets.html
In Think to Start you’ll find the steps needed to analyze LinkedIn with R:
1. Get the package
2. Authenticate with LinkedIn
3. Analyze LinkedIn with R
Also go to https://github.com/mpiccirilli/Rlinkedin
LikeAlyzer is an online tool for companies that want to be successful on Facebook. It helps you measure and analyze the potential and effectiveness of your Facebook Pages:
- It provides daily updated Facebook statistics for your company or other Pages of interest.
- It enables you to monitor and compare your efforts with those of the world’s popular brands or relevant companies, such as competitors.
Gephi is an interactive visualization and exploration solution that supports dynamic and hierarchical graphs. It runs on Windows, Linux and Mac OS X. Gephi is open-source and free.
The goal is to help data analysts form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing. It is a complementary tool to traditional statistics, as visual thinking with interactive interfaces is now recognized to facilitate reasoning.
- Real time visualization
- Layout algorithms (force-based and multi-level)
- Metrics (Betweenness, Closeness, Diameter, Clustering Coefficient, Average shortest path, PageRank, HITS, Community Detection, Random Generators)
- Dynamic Network Analysis
- Create cartography
- Clustering and hierarchical graphs
- Dynamic Filtering
To learn more about it, go to http://gephi.org/
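To give a sense of what some of the metrics listed above measure, here is a minimal sketch in plain Python that computes closeness, diameter and average shortest path with breadth-first search. It is not Gephi's implementation, it assumes a connected, undirected, unweighted graph, and the example graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency list (illustrative only).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E"],
    "E": ["D"],
}


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(graph, node):
    """Closeness: (reachable nodes - 1) / sum of distances to them."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def diameter_and_average_path(graph):
    """Longest and mean shortest-path length over all ordered node pairs."""
    longest, total, pairs = 0, 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, (total / pairs if pairs else 0.0)


diameter, average_path = diameter_and_average_path(graph)
print(f"closeness of B: {closeness(graph, 'B'):.2f}")
print(f"diameter: {diameter}, average shortest path: {average_path:.2f}")
```

Node B sits in the middle of this small graph, so it has the highest closeness; node E hangs off the end and has the lowest.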
RDatamining is a site where you can download books and slides related to Data Mining with R for personal non-commercial use.
To view this information go to: http://www.rdatamining.com/docs
By Michelle Fleury, BBC business correspondent, New York
Your next stockbroker might just be a computer.
More and more, financial firms are turning to machines to do the job humans have done for decades.
Last spring, wealth management firm Charles Schwab launched a new service called Schwab Intelligent Portfolios. The service is unique in that it’s not a person who decides where to invest your money, it’s an algorithm – lines of code programmed into a computer.
“It’s lower cost for the investor,” says Tobin McDaniel, who leads the Schwab Intelligent Portfolios team.
“As opposed to working with a traditional advisor where you might pay up to 1%, here you get portfolio management at essentially no management fee.”
To learn more: How artificial intelligence is transforming the financial industry
Reading for this week: The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data (Feldman & Sanger, 2006)
Text mining tries to solve the crisis of information overload by combining techniques from data mining, machine learning, natural language processing, information retrieval, and knowledge management. In addition to providing an in-depth examination of core text mining and link detection algorithms and operations, this book examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches. Finally, it explores current real-world, mission-critical applications of text mining and link detection in such varied fields as M&A business intelligence, genomics research and counter-terrorism activities.