Interactive visualization of large data-sets (2011 DOWNIE)

This project, undertaken in collaboration with Rensselaer Polytechnic Institute, expanded our open source platform Field into an advanced tool for the interactive visualization of large datasets on the basis of data mining. It was funded by the National Science Foundation (EAGER grant IIS-1048440).

Implementation

  • Linux Port. Field was successfully ported to the Linux OS and is now running on Linux machines outside OpenEndedGroup.

  • JavaScript extensions. Field now supports remote, web-browser-hosted JavaScript execution.

  • Generic chart-plotting API. A new high-level chart drawing system was built directly inside Field’s high-performance OpenGL graphics system. Field’s new charting reuses the dynamic scoping techniques used elsewhere in Field’s codebase to create a set of reusable, modifiable chart types in the tradition of the Google Motion Chart API – adding, of course, a 3D layout canvas and GLSLang shader processing of the geometry being drawn.

  • Graph navigating visualizations. We took a fresh approach to graph visualization by concentrating on what static graph layout algorithms cannot address, namely interaction and animation, reusing the new automatic transition generation work from the charting API. We believe that we have the beginnings of an interesting way to visualize graphs that privileges visualization “as a mode of thought” rather than visualization “as a mode of presentation”.

  • Directly embedded web content in graphics system. By embedding WebKit into Field, we are now able to draw 3D geometry textured with rendered web content. By remapping mouse events from window-space into geometry-space, we are able to interact with these surfaces as if they were tabs in a running web browser. And by inspecting the Document Object Model of the underlying HTML content, we are able to render overlays and visualizations of web content alongside these 3D windows.

  • An Incanter / Clojure / SVG workbench. We added support for Clojure (a relatively recent Lisp-like programming language) and Incanter (a statistics and visualization library for Clojure). This was partly a proof of concept and partly a direct response to the needs of the graduate students at RPI: to close the gulf between environments where serious exploratory statistics work can be done (for example, the programming environment “R”) and those where useful visualization work can happen (for example, JavaScript + Google Visualization API). Part of this work resulted in a new Java Graphics2D / SVG / Field bridge that captures the output of visualization libraries that emit SVG files or Java drawing commands and re-injects this geometry into high-speed Field rendering canvases.

  • New language support for Field. Ruby has been added to the list of languages that Field supports. Work has started on bindings for the audio-specific programming language SuperCollider. Mirah, which builds on Ruby, is an interesting language for data visualization because it compiles to fairly conventional, non-reflective Java – so it potentially scales far better to large datasets than Field’s core language, Python, which remains essentially interpreted. SuperCollider is likely to be the language of choice for real-time DSP work, including not just the playback of sounds in Field but also real-time analysis. Both of these languages may have something to contribute as we enter the next phase of this project.
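
The mouse-event remapping mentioned under the embedded web content item above can be illustrated with a small ray/quad intersection. What follows is only a minimal sketch in plain Python, not Field's actual code: the function name `window_ray_to_page_pixel`, the quad parameterisation (one corner plus two edge vectors) and the assumption that the pick ray has already been un-projected from window-space are all hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def window_ray_to_page_pixel(ray_origin, ray_dir, corner, edge_u, edge_v,
                             page_w, page_h):
    """Map a pick ray onto a web-textured quad (hypothetical helper).

    corner  : quad vertex that shows page pixel (0, 0)
    edge_u  : vector from corner along the page's width
    edge_v  : vector from corner along the page's height
    Returns the page pixel (x, y) under the ray, or None on a miss.
    """
    normal = _cross(edge_u, edge_v)
    denom = _dot(normal, ray_dir)
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the quad (or the quad is degenerate)
    t = _dot(normal, _sub(corner, ray_origin)) / denom
    if t < 0:
        return None  # the quad's plane is behind the ray origin
    hit = tuple(o + t * d for o, d in zip(ray_origin, ray_dir))
    rel = _sub(hit, corner)
    # Solve rel = u * edge_u + v * edge_v; this also handles skewed quads.
    uu, vv, uv = _dot(edge_u, edge_u), _dot(edge_v, edge_v), _dot(edge_u, edge_v)
    ru, rv = _dot(rel, edge_u), _dot(rel, edge_v)
    det = uu * vv - uv * uv
    u = (ru * vv - rv * uv) / det
    v = (rv * uu - ru * uv) / det
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return None  # the ray hits the plane outside the quad
    return (u * page_w, v * page_h)
```

The returned pixel coordinate is what one would forward to the embedded browser as a synthetic mouse event; how Field itself delivers that event to WebKit is not shown here.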

Key Tutorials and Use Examples

  • “RelFinder in Field” tutorial. This is a long, detailed tutorial showing the step-by-step creation of a Field-based environment for visualizing relationships found within semantic web data. The goal of the tutorial is to construct a workbench that shares much of the functionality of “RelFinder”, an open source and well-crafted online example of such a tool. The resulting implementation inside Field is radically shorter, but it is also radically transparent: the contributions, difficulties and opportunities of the research present in the original work that we have duplicated here are much more visible.

  • “D3 in Field” tutorial. Here we duplicate the demo set of the D3 visualization framework in Field. D3 is a very carefully crafted, latest-generation, JavaScript-based library for delivering visualizations inside browsers, and it has provided a broad and perfectly typical range of visualization types for us to test Field’s basic visualization library against.

  • VSTO Ontology Explorer – continuing on from the “D3 in Field” work above, we constructed an interactive, navigable view of the ontologies associated with the Virtual Solar-Terrestrial Observatory. Rendered using D3, and developed and served in Field, this “live coded” web app draws together the JavaScript support of Field with earlier LOGD/SPARQL work.

  • WebGL – The recent release into the mainstream of WebGL, the browser-hosted JavaScript bindings for OpenGL, also closes the gap between Field’s graphics system and the possibilities of Field’s online experience. WebGL is now shipping in current versions of Chrome and Firefox. Ultimately this will allow a graphics system that is not too different from Field’s to run in web browsers directly. To this end we have created a tutorial / demonstration showing the creation of an in-browser visualizer for the popular .PLY file format.

  • GPU accelerated Force Directed Layout (FDL). Field’s graphics system can reach GPU accelerated versions of force-directed layout algorithms with relative ease, something that other runtimes essentially cannot do. We have written a prototype layout implementation that remains interactive at vastly higher node counts than CPU implementations. By placing all of the implementation – geometry setup in Python together with GLSLang shader code – in a single environment, we believe this implementation to be the clearest, most self-contained example of GPU accelerated FDL broadly available.
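
For readers unfamiliar with force-directed layout, one iteration of the basic algorithm can be sketched on the CPU in plain Python. This is not the GPU prototype described above and makes no claim about how Field implements it; the function name `fdl_step`, the inverse-square repulsion, the linear springs and all parameter values are assumptions chosen for illustration only.

```python
import math

def fdl_step(positions, edges, repulsion=0.01, spring=0.1,
             rest_length=1.0, step=0.05):
    """Return new 2D positions after one force-directed iteration.

    positions : list of (x, y) tuples, one per node
    edges     : list of (a, b) index pairs
    """
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]

    # All-pairs repulsion. Each node's sum is independent of the others,
    # which is why this O(n^2) loop is the usual candidate for a GPU port.
    for i in range(n):
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = xi - positions[j][0]
            dy = yi - positions[j][1]
            d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
            d = math.sqrt(d2)
            f = repulsion / d2
            forces[i][0] += f * dx / d
            forces[i][1] += f * dy / d

    # Linear springs pull connected nodes toward rest_length apart.
    for a, b in edges:
        dx = positions[b][0] - positions[a][0]
        dy = positions[b][1] - positions[a][1]
        d = math.sqrt(dx * dx + dy * dy) + 1e-9
        f = spring * (d - rest_length)
        fx, fy = f * dx / d, f * dy / d
        forces[a][0] += fx
        forces[a][1] += fy
        forces[b][0] -= fx
        forces[b][1] -= fy

    # Simple explicit update: move each node along its net force.
    return [(x + step * fx, y + step * fy)
            for (x, y), (fx, fy) in zip(positions, forces)]
```

Calling `fdl_step` repeatedly, for example once per frame, lets the layout settle while the user continues to interact with it; a GPU version would typically evaluate the per-node repulsion sum in a shader rather than in the nested Python loop.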

Publications and Products:

  • Downie, M., Kaiser, P., Enloe, D., Fox, P., Hendler, J., Goebel, J. & Ameres, E., 2011, Evolving a Rapid Prototyping Environment for Visually and Analytically Exploring Large-Scale Linked Open Data, Proceedings of the 2011 IEEE Symposium on Large-Scale Data Analysis and Visualization, to appear.

  • [The Field documentation and tutorial website] (http://openendedgroup.com/field) Especially:

    [Field – Live Coding JavaScript and Python] (http://vimeo.com/31452523)
    [Field – Protovis + SPARQL] (http://vimeo.com/31452715)
    [Field – WebGL] (http://vimeo.com/31458737)

Participants, Management and Coordination:

  • Johannes Goebel, co-PI – project management
  • Peter Fox, co-PI – concept development, project management, outreach
  • Marc Downie, consultant – concept development, programming, documentation
  • Paul Kaiser, consultant – concept development, documentation, and outreach
  • Eric Ameres – developed consumer stereo playback for Field; user testing of Field
  • Dylan Enloe – graph exploration and early graphics system testing
  • Alvaro Graves – JavaScript, D3 and SPARQL programming