They Rule!: a social visualization tool of the US ruling class

They Rule aims to provide a glimpse of some of the relationships of the US ruling class. It takes as its focus the boards of some of the most powerful US companies, which share many of the same directors. Some individuals sit on the boards of 5, 6 or 7 of the top 500 companies. It allows users to browse through these interlocking directories and run searches on the boards and companies. A user can save a map of connections complete with their annotations and email links to these maps to others. They Rule is a starting point for research about these powerful individuals and corporations.
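To make the idea of interlocking boards concrete, here is a minimal sketch in Python. It is not how They Rule is implemented; the company and director names are made up, and the helper names `seats_by_director` and `interlocks` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: company -> set of directors (all names invented).
boards = {
    "Acme Corp": {"A. Smith", "B. Jones", "C. Lee"},
    "Globex": {"B. Jones", "D. Patel"},
    "Initech": {"C. Lee", "B. Jones", "E. Rossi"},
}

def seats_by_director(boards):
    """Invert the mapping: director -> set of companies they sit on."""
    seats = defaultdict(set)
    for company, directors in boards.items():
        for director in directors:
            seats[director].add(company)
    return dict(seats)

def interlocks(boards):
    """Return {(company_a, company_b): shared directors} for each linked pair."""
    links = {}
    for a, b in combinations(sorted(boards), 2):
        shared = boards[a] & boards[b]
        if shared:
            links[(a, b)] = shared
    return links

seats = seats_by_director(boards)
# Directors holding more than one seat are the "connectors" of the map.
multi_seat = {d: sorted(c) for d, c in seats.items() if len(c) > 1}
print(multi_seat)
print(interlocks(boards))
```

The same two dictionaries would be enough to draw a graph with companies as nodes and shared directors as edges.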

We should have something like this for Italian politics. It would be fun to see everything around Mr. B.

MontyLingua: a commonsense enriched part of speech tagger

MontyLingua is a free, commonsense-enriched, end-to-end natural language understander for English. Feed raw English text into MontyLingua, and the output will be a semantic interpretation of that text. Perfect for information retrieval and extraction, request processing, and question answering. From English sentences, it extracts subject/verb/object tuples, extracts adjectives, noun phrases and verb phrases, and extracts people’s names, places, events, dates and times, and other semantic information.

[Image: MontyLingua v2 screenshot]

Testing Visual Information Retrieval Methodologies Case Study

E. Morse and M. Lewis. Testing visual information retrieval methodologies case study: Comparative analysis of textual, icon, graphical, and “spring” displays. Journal of the American Society for Information Science and Technology, 53(1):28–40, 2002.

————————-

Although many different visual information retrieval systems have been proposed, few have been tested, and where testing has been performed, results were often inconclusive. Further, there is very little evidence of benchmarking systems against a common standard. An approach for testing novel interfaces is proposed that uses bottom-up, stepwise testing to allow evaluation of a visualization, itself, rather than restricting evaluation to the system instantiating it. This approach not only makes it easier to control variables, but the tests are also easier to perform. The methodology will be presented through a case study, where a new visualization technique is compared to more traditional ways of presenting data.

Building a document map

These days I am pretty busy working with Lorenzo on our super secret project on Context Network Graphs. Our schedule slipped because we were trying to find a decent way to show a document collection on a two-dimensional map. We started with an ordered list of documents with ranking values.

From this one-dimensional situation we had to develop a second dimension of information, and I can now swear that it was not easy. We chose to use triangulation, and the biggest problem we fought was that some triangles did not close properly. This document demonstrates how to compute whether three documents can be placed in a triangle.
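As a rough illustration of the "does the triangle close" question, here is a small Python sketch. It is not our actual code: it assumes the three pairwise distances between documents are already known (how they are derived from the ranking values is left out), and the function names `can_form_triangle` and `place_triangle` are hypothetical.

```python
import math

def can_form_triangle(d_ab, d_ac, d_bc):
    """True if the three distances satisfy the triangle inequality."""
    longest = max(d_ab, d_ac, d_bc)
    return longest <= (d_ab + d_ac + d_bc) - longest

def place_triangle(d_ab, d_ac, d_bc):
    """Place A at the origin, B on the x-axis, and C where the circle of
    radius d_ac around A meets the circle of radius d_bc around B.

    Returns the three points, or None when the circles leave a gap
    (the triangle does not close).
    """
    if d_ab <= 0 or not can_form_triangle(d_ab, d_ac, d_bc):
        return None
    # Standard two-circle intersection along the A-B axis.
    x = (d_ab**2 + d_ac**2 - d_bc**2) / (2 * d_ab)
    y_squared = d_ac**2 - x**2
    # Clamp tiny negative values caused by floating-point rounding.
    y = math.sqrt(max(y_squared, 0.0))
    return (0.0, 0.0), (d_ab, 0.0), (x, y)

print(place_triangle(3, 4, 5))   # distances that close into a triangle
print(place_triangle(1, 1, 3))   # circles too far apart: no triangle
```

When the longest distance exceeds the sum of the other two, the two circles never touch, which is one way a gap like the one described here can appear.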

To verify that, I wrote a quick hack in Python that showed some gaps in the circles formed between each pair of points (see picture below). This was fun. Finding out how to fix it was not fun at all. But finally …

[Picture: document map]

meX-Search: a meta search engine

meX-Search is a meta search engine that automatically categorizes search results into thematic groups and displays them as intuitive, interactive maps.

meX-Search is an experimental, non-commercial meta search engine built from April to July 2004 by Karsten Knorr for his diploma thesis in computer and media science [University of Applied Science Berlin]. The main idea of the thesis was the implementation of an intuitive and simple user interface for web clustering search engines.

Users of conventional Web search engines are often forced to sift through a long list of off-topic documents to find relevant results… Especially when the search query is general, it is often hard to find relevant resources among thousands of irrelevant ones. Search result clustering is an approach to handling such problems by grouping similar documents among search results into thematic groups.
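To give a feel for what grouping result snippets means, here is a toy Python sketch. It is not the algorithm meX uses: it swaps in a simple greedy grouping by word overlap (Jaccard similarity), and the stopword list, the threshold and the sample snippets are all assumptions made for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "with"}

def tokens(text):
    """Lowercased set of words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_snippets(snippets, threshold=0.2):
    """Greedy single pass: each snippet joins the first group whose seed
    snippet is similar enough, otherwise it starts a new group."""
    groups = []  # list of (seed_tokens, member_snippets)
    for snippet in snippets:
        words = tokens(snippet)
        for seed, members in groups:
            if jaccard(words, seed) >= threshold:
                members.append(snippet)
                break
        else:
            groups.append((words, [snippet]))
    return [members for _, members in groups]

snippets = [
    "Jaguar cars and luxury vehicles",
    "Jaguar habitat in the rain forest",
    "New Jaguar cars for sale",
    "Rain forest animals: the jaguar",
]
for group in cluster_snippets(snippets):
    print(group)
```

With an ambiguous query like "jaguar", even this crude grouping separates the car results from the animal results, which is the effect a clustering search engine is after.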

meX is a meta search engine. Currently meX gets all of its search results from the Yahoo API.

The clustering of the result snippets from Yahoo is based on Carrot2, an open source Java framework for clustering textual data. Within the Carrot2 framework, meX uses the Lingo algorithm. The authors of the Carrot2 framework and components are Dawid Weiss, Jerzy Stefanowski and Stanislaw Osinski.

[Image: meX-Search]

Carrot2: a clustering framework

Carrot2 is a research framework for experimenting with automated querying of various data sources (such as search engines), processing search results and their visualization.

By “research” we mean that the architecture of the system is oriented mostly toward flexibility, sometimes at the cost of performance. Mechanisms such as data exchange via XML, dynamically loaded components accessible via HTTP, and the use of Java as the primary implementation language all make the system very easy to tailor to one’s needs. Carrot2 was primarily built with search results clustering in mind, but it can easily be configured to do other interesting things.

[Image: Carrot2 components dataflow]

Coal, China, and India: A Deadly Combination for Air Pollution?

I found this great portal of the Worldwatch Institute, an independent research organization that works for an environmentally sustainable and socially just society by providing compelling, accessible, and fact-based analysis of critical global issues. The portal offers access to a variety of publications that synthesize research on environmental issues. Most of the publications are accessible for a small payment that sustains the activity of the institute. I think it is a small price for the quality of the information they provide.

Browsing the site I found this article on the coal consumption projections for the year 2010:

The rapid growth in coal use in China and India, where pollution controls are minimal, is adding to local and long-distance pollution. More than 80 percent of Chinese cities in a recent World Bank survey had sulfur dioxide or nitrogen dioxide emissions above the World Health Organization’s threshold.

Scientists have concluded that growing up in a city with polluted air is about as harmful to a person’s health as growing up with a parent who smokes. Although air pollution is concentrated in cities, it can move well beyond them: for example, acidic lakes in Scandinavia have been linked to pollution from factories in the United States. The World Bank projected that on average 1.8 million people would die prematurely each year between 2001 and 2020 because of air pollution.

[Image: Fossil Consumption, 2005]

Combine: an open source crawler

Combine is an open system for crawling [harvesting and threshing (indexing)] Internet resources. The name is derived from the combine-harvester since the two perform their jobs in a similar way.

Combine was initially developed as part of the Development of a European Service for Information on Research and Education (DESIRE) project, which was funded by the European Commission within the Telematics for Science Program.

It is now being modified for focused crawling by integrating the automated topic classification algorithms, also developed in DESIRE, with the crawler. This work is funded by Vinnova, the Swedish Agency for Innovation Systems (project P22504-1 A), and the EU project ALVIS.
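For readers who have not met focused crawling before, here is a minimal Python sketch of the general idea. It is not Combine's code: the "web" is a hypothetical in-memory dictionary, and a plain keyword count stands in for the DESIRE topic classification algorithms.

```python
import heapq

# Hypothetical in-memory "web": url -> (page text, outgoing links).
PAGES = {
    "seed": ("portal about energy and coal", ["coal", "sports"]),
    "coal": ("coal mining and coal power plants", ["pollution"]),
    "sports": ("football results and scores", ["tennis"]),
    "pollution": ("air pollution from coal plants", []),
    "tennis": ("tennis rankings", []),
}

def topic_score(text, topic_terms):
    """Stand-in classifier: count occurrences of topic terms in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in topic_terms)

def focused_crawl(seed, topic_terms, fetch=PAGES.get, max_pages=10):
    """Best-first crawl: harvest a page, index it only if it is on topic,
    and follow its links with a priority equal to the page's score."""
    frontier = [(0, seed)]  # heap of (negated priority, url)
    seen = {seed}
    index = {}              # url -> topic score of indexed pages
    while frontier and len(index) < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue
        text, links = page
        score = topic_score(text, topic_terms)
        if score == 0:
            continue  # off topic: neither indexed nor expanded
        index[url] = score
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return index

print(focused_crawl("seed", {"coal", "pollution"}))
```

In this toy run the crawler indexes the seed, coal and pollution pages and never reaches the tennis page, because the off-topic sports page is not expanded.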