Minimum Spanning Tree of Urban Tapestries messages
Recently I managed to calculate the Minimum Spanning Tree (MST) of the Urban Tapestries dataset that was collected during the trial of 2004. From Wikipedia:
Given a connected, undirected graph, a spanning tree of that graph is a subgraph which is a tree and connects all the vertices together. A single graph can have many different spanning trees. We can also assign a weight to each edge, which is a number representing how unfavorable it is, and use this to assign a weight to a spanning tree by computing the sum of the weights of the edges in that spanning tree. A minimum spanning tree or minimum weight spanning tree is then a spanning tree with weight less than or equal to the weight of every other spanning tree. More generally, any undirected graph has a minimum spanning forest.
Copyright notice: the present content was taken from the following URL, the copyrights are reserved by the respective author/s.
For the Urban Tapestries dataset, I adapted a version of Kruskal’s algorithm for minimum spanning trees, starting from the excellent implementation by Prof. David Eppstein (thanks!). The final result is visible below. The next step is to calculate an MST over the semantic distribution of the points and then find a way to compare the two trees, in order to measure the distortion between the two layers.
Copyright notice: Google 2005, The GeoInformation Group 2006, the copyrights are reserved by the respective author/s.
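For readers unfamiliar with Kruskal’s algorithm, here is a minimal sketch in Python. It is not Prof. Eppstein’s implementation, nor the adapted version used for the map above. It assumes that each message is a 2D point and that the edge weight is the plain Euclidean distance between two points; the function name and the sample coordinates are made up for illustration.

```python
from itertools import combinations
from math import dist


def kruskal_mst(points):
    """Return the MST edges of the complete graph over `points`.

    Each edge is a tuple (i, j, weight), where i and j index into
    `points` and weight is the Euclidean distance (an assumption).
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All candidate edges, sorted from shortest to longest.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


# Hypothetical message coordinates, for illustration only.
sample_points = [(0, 0), (1, 0), (1, 1), (5, 5)]
for i, j, weight in kruskal_mst(sample_points):
    print(i, j, round(weight, 3))
```

Building every pairwise edge is quadratic in the number of points, which is fine for a sketch but would need a smarter candidate set (for instance a Delaunay triangulation) on a large dataset.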
Tags: anamorphosis maps, beta-skeletons, clustering, information metric, information visualization, map algorithms, maps, spatial clustering
Using Visualizations to Analyze Workspace Activity and Discern Software Project Evolution
R. M. Ripley, A. Sarma, and A. van der Hoek. Using visualizations to analyze workspace activity and discern software project evolution. ISR Technical Report UCI-ISR-06-1, University of California, Irvine, California, USA, 2006. [pdf]
———————–
In this paper, the authors present a prototype of a 3D visualization for workspace activity. The visualization shows only active workspaces and artifacts and depicts activities as changes to artifacts. Each change to an artifact is denoted as a cylinder, and stacks of such cylinders represent either the activities in a particular workspace (developer-centric) or the activities carried out on a specific artifact (artifact-centric). They applied the visualization to several open-source projects and demonstrated project evolution and other interesting situations.
The visualization has two primary modes: developer-centric and artifact-centric, which we will discuss shortly. Common to both modes, the visualization shows only active artifacts and workspaces, thereby eliminating the clutter of inactive entities. Stacks of cylinders represent workspace activities (changes to artifacts), with each cylinder corresponding to a particular artifact in a particular workspace with the dimensions representing the size of the change (the bigger the change, the larger the cylinder).
In the developer-centric view (see the figure, inset), a stack of cylinders represents a developer’s workspace with each cylinder representing an artifact being changed in that workspace. Workspaces with many activities correspond to tall stacks of cylinders. The stacks of cylinders with the most recent changes are placed in the front of the view and, as time elapses, stacks for workspaces with “older” activity slowly start moving to the back, representing dormancy. Thereby a user is able to quickly discern the loci of activities from the height of the stacks and recency of these activities from their position. The artifact-centric view (see the figure, bottom) behaves similarly to the developer view, but instead each stack of cylinders represents a particular artifact and each cylinder in the stack represents changes to that artifact made by a particular workspace.
Tags: collaboration tools, information visualization, software visualization
A visual dictionary of symbols
While trying to help Nicolas find nice symbols for the CatchBob! semi-structured interface, I recalled this nice web site that I had visited in the past. It’s a visual dictionary of symbols. For each pictogram it shows references that use the symbol and a description of its cross-cultural meaning. Different search tools are available.
Tags: information visualization, universal cognitive distance
Drift in Toulouse: a participatory simulation
On 16 October 2005, a participatory simulation was organised in Toulouse during the Science Fair for the World Year of Physics. The experiment simulated the famous Brownian motion, which Einstein theorised a hundred years ago.
In this experiment, the randomly moving particles were played by high-school students. They moved through the city center, choosing the direction of each move at random. The following map shows the particles’ positions after 50 crossings.
The observed results deviated from the expected ones: the center of gravity is shifted towards the right-hand side of the map. Further exploration of the data explained this shift by the presence of an architectural barrier on the city paths. The river, in fact, played an important role in steering the particles towards the upper right quadrant of the map.
This relates to my previous post, where I highlighted that geography and geometry are not synonymous. We live in and perceive a geographical world, which we try to model with geometrical models. These models are often limited because they do not incorporate important factors such as, in this case, natural barriers.
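The effect of a barrier on the center of gravity can be reproduced with a toy model. The sketch below is not the simulation that was run in Toulouse: it assumes a square grid of streets, four equally likely directions at each crossing, and a river modelled as a vertical line that particles cannot cross. All names and parameter values are made up for illustration.

```python
import random

# The four directions a particle can take at a crossing (assumed grid).
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def simulate(n_particles, n_crossings, barrier_x=None, seed=0):
    """Random walk on a grid, starting from the origin.

    If `barrier_x` is given, a move that would take a particle to
    x < barrier_x is cancelled and the particle stays where it is.
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x, y = 0, 0
        for _ in range(n_crossings):
            dx, dy = rng.choice(STEPS)
            if barrier_x is not None and x + dx < barrier_x:
                continue  # blocked by the river
            x, y = x + dx, y + dy
        positions.append((x, y))
    return positions


def centroid(positions):
    """Center of gravity of a list of (x, y) positions."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n,
            sum(y for _, y in positions) / n)


free = simulate(200, 50)                   # no barrier
blocked = simulate(200, 50, barrier_x=-2)  # river two blocks to the left
print("without barrier:", centroid(free))
print("with barrier:   ", centroid(blocked))
```

Because both runs use the same seed, they draw the same sequence of directions, so the only difference between them is the set of cancelled leftward moves. The centroid of the blocked run therefore sits further to the right, which is the kind of shift the map shows.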
Tags: field tools, map algorithms, maps, mobile learning, psychogeography, participatory simulations, urban exploration
Podzinger: a podcast search tool
Podcasts are rapidly emerging as a popular on-line audio publishing vehicle. But trying to find podcasts that are of interest to us is reminiscent of the early days of the internet. Instead of using general search techniques, sites attempted to organize all the content on the internet into categories. The diversity of the available content became too cumbersome for mere categorization to be an effective means for people to find what they wanted. Search evolved to where it is today, that is, to look inside websites quickly for words and phrases.
Podcasts have been subjected to the same primitive search through categorization … until now. PODZINGER looks inside podcasts, not just the metadata, letting you search podcasts in the same way that you search for anything else on the web.
When you type in a word or terms, PODZINGER not only finds the relevant podcasts, but also highlights the segment of the audio in which they occurred. By clicking anywhere on the results, the audio will begin to play just where you clicked. There are also controls that let you back up, pause, or forward through the podcast. Or you can download the entire podcast.
Tags: search engine
Dynamic business models: mass customization
Deciding how much latitude to give users is an essential part of any mass-customization program, says B. Joseph Pine, the author of Mass Customization: The New Frontier in Business Competition (Harvard Business School Press, 1993). “Fundamentally,” he says, “customers don’t want choice. They just want exactly what they want. Your job is to help them figure out what it is they want, because often they don’t know or can’t articulate it.”
That is, in essence, the core of LEGO’s new business model, which is articulated in three steps: (1) the user chooses the design s/he wants using a support tool; (2) the factory creates an ad hoc package of the bricks; (3) the package is sent to the customer.
I am not sure this model fulfils the idea of helping users figure out what they want, but I really like the idea of customizing the brick set.
Meeting with Jacques Lévy: exploring spatialised messages
Yesterday I had a wonderful conversation with Jacques Lévy and André Ourednik (of the Choros laboratory, EPFL) on the methodology to explore a dataset of “spatialised messages”. We drafted a couple of points and ideas that I am trying to summarise in this post.
1. Geometry and geography are not synonyms. This is the main point that Jacques raised, as these two dimensions are often conceived as the same thing. Geometry might be a good approximation of geography under exceptional circumstances: for instance, if a part of the city has a linear structure, the density of buildings is constant, the transport systems are uniformly available across the territory, and so on. In such a case we can use this approximation. However, current approaches to spatial cognition tend to start from the assumption that geography is Euclidean. If people have a distorted perception of this “reality”, then they are wrong and these false perceptions should be treated as “mistakes”. Of course, Jacques and I agreed that we can consider different starting assumptions, like the fact that the geography of a place is given by these distortions, which are not mistakes.
2. As a pragmatic approach to the analysis of the messages, Jacques proposed to measure the messages’ average geometrical distance against the threads’ average connection distance. This measure might give us a view of the proportion of conversations that extend beyond the local clusters to represent similar features of the city in different neighborhoods.
3. A second idea that emerged in the meeting is to compare the message clusters (or the message distribution) with other census information on the same area of the city. For instance: does the highest density of points correspond to the highest density of shops in the area? Does it match the area that the users use the most for various reasons (e.g., work, leisure, residence)? An operational exploration that we agreed to perform on the data is to compare the message distribution with different layers that we can obtain from a statistical bureau or census data.
4. Finally, André proposed to work on a possible anamorphosis map considering different attributes of the messages, for instance the semantic distance between each pair of points. Other dimensions can be explored, such as the time evolution of the conversations, the threading between the messages or the social network that developed between the users of the system. The basic principle of the anamorphosis is that the map is divided into cells. For each morph a feature is chosen and measured in each cell. Then the area of each cell is set proportional to the measured feature, so that the final map equalises the chosen feature across the whole map.
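The measure proposed in point 2 can be sketched in a few lines of Python. This is only one possible reading of the idea, under stated assumptions: messages are 2D points, the “geometrical average distance” is the mean Euclidean distance over all pairs of messages, and the “average connection distance” is the mean distance over pairs of messages that reply to each other in a thread. The function names and the sample data are hypothetical.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of messages."""
    pairs = list(combinations(points, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def mean_thread_distance(points, links):
    """Mean Euclidean distance over message pairs linked in a thread.

    `links` is a list of (i, j) index pairs into `points`.
    """
    return sum(dist(points[i], points[j]) for i, j in links) / len(links)


# Hypothetical message positions and thread links, for illustration only.
messages = [(0, 0), (3, 4), (6, 8)]
thread_links = [(0, 1), (1, 2)]

ratio = (mean_thread_distance(messages, thread_links)
         / mean_pairwise_distance(messages))
print(ratio)
```

Under this reading, a ratio well below 1 would suggest that conversations stay within local clusters, while a ratio close to or above 1 would suggest that threads connect distant parts of the city.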
During our last meeting with Pierre, we briefly sketched a couple of other possible explorations to perform on the dataset. Pierre proposed to build different maps giving more importance, in turn, to the geographical, the semantic or the social dimension of the data. Then for each map we could measure and compare a few parameters, such as the dispersion (are the messages dense or sparse?), the structure of the connections (what is the form of the links of the Minimal Walking Tree built on the map?), the number of links per node (again with the MWT), etc.
We ended the meeting by proposing to explore some of the directions highlighted.
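Two of these parameters are easy to sketch in Python. The definitions below are assumptions chosen for illustration, not something agreed in the meeting: dispersion is taken as the mean distance of the messages from their center of gravity, and the number of links per node is counted from a list of tree edges given as index pairs. The sample data is hypothetical.

```python
from collections import Counter
from math import dist


def dispersion(points):
    """Mean distance of the points from their center of gravity."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum(dist(p, (cx, cy)) for p in points) / n


def links_per_node(edges):
    """Count how many tree edges touch each node.

    `edges` is a list of (i, j) node-index pairs.
    """
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree


# Hypothetical map: four messages and a star-shaped tree centred on node 1.
points = [(0, 0), (2, 0), (0, 2), (2, 2)]
tree_edges = [(0, 1), (1, 2), (1, 3)]

print(dispersion(points))
print(dict(links_per_node(tree_edges)))
```

Comparing these numbers across the geographical, semantic and social maps would give a first, rough indication of how differently the same messages are organised in each layer.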
Tags: anamorphosis maps, map algorithms, maps, psychogeography, spatial clustering, urban exploration
Palantir: Raising Awareness among Configuration Management Workspaces
A. Sarma, Z. Norozi, and A. van der Hoek. Palantir: Raising awareness among configuration management workspaces. In Proceedings of, pages 444–454, Portland, Oregon, USA, May 2003. IEEE. [pdf]
——————-
This paper presents Palantir, a system to enhance users’ awareness in a configuration management workspace. The starting assumption is that current configuration management workspaces isolate developers. This isolation is said to have both positive and negative effects, the latter including overlapping changes and low shared understanding of each other’s code. Palantir is said to overcome this bad isolation by inverting information flow from push to pull. Awareness is the understanding of the activities of others, which provides a context for one’s own activity.
Previous coordination systems are limited in the sense that they inform the users (developers) only of direct conflicts concerning individual artefacts. What is missing is an overall view of the workspace. Palantir builds on top of existing configuration management facilities and concentrates on the collection, distribution, organization and presentation of relevant workspace information.
Palantir has two visualizations: one is a ticker tape, the other is a fully graphical visualization that maintains an overview of workspace activities. The artefacts can be filtered by different criteria, such as severity. Palantir exhibits three key properties: (1) its coordination mechanism is based on workspace rather than repository information; (2) it continuously informs developers of other ongoing efforts; (3) it provides an overall view of other workspaces that supports the detection of both direct and indirect conflicts.
Tags: collaboration tools, information visualization, software visualization
PhD Annual Report
M. Cherubini. PhD annual report. Ecole Polytechnique Fédérale de Lausanne, Ecublens, Station 1, CH-1015 Lausanne, Switzerland, 2006. [pdf]
———————
This work targets Collaborative Annotations of a Map in a mobile setting. This is a form of communication that makes explicit use of the geographical/physical context as a referent for the message content. The goal of this study is to develop computational support for such communication, defining a model that integrates spatial information with the textual information produced through computer-mediated communication. As a first step, I will analyse datasets of these particular messages with the aim of understanding their peculiarities in comparison with comparable canonical forms. Afterwards, I will use this information to build a specific information retrieval engine that will support the users’ exploration of the information space.