[1] aQuery: http://donhopkins.com/mediawiki/index.php/AQuery
Morgan Dixon's work is truly breathtaking and eye opening, and I would love for that to be a core part of a scriptable hybrid Screen Scraping / Accessibility API approach.
Screen scraping techniques are very powerful, but have limitations. Accessibility APIs are very powerful, but have different limitations. Using both approaches together, screencasting and re-composing visual elements, and tightly integrating them with JavaScript, enables a much wider and more interesting range of possibilities.
Think of it like augmented reality for virtualizing desktop user interfaces. The beauty of Morgan's Prefab is how it works across different platforms and web browsers, over virtual desktops, and how it can control, sample, measure, modify, augment and recompose the GUIs of existing unmodified applications, even translating their language dynamically, so they're much more accessible and easier to use!
----
James Landay replies:
Don,
This is right up the alley of UW CSE grad student Morgan Dixon. You might want to also look at his papers.
Don emails Morgan Dixon:
Morgan, your work is brilliant, and it really impresses me how far you've gone with it, how well it works, and how many things you can do with it!
I checked out your web site and videos, and they provoked a lot of thought so I have lots of questions and comments.
I really like the UI Customization stuff, and also the sideviews!
Combining your work with everything you can do with native accessibility APIs, in an HTML/JavaScript based, user-customizable, scriptable, cross platform user interface builder like (but transcending) HyperCard would be awesome!
I would like to discuss how we could integrate Prefab with a JavaScriptable, extensible API like aQuery, so you could write "selectors" that used Prefab's pattern recognition techniques, bind those to JavaScript event handlers, and write high level widgets on top of that in JavaScript, and implement the graphical overlays and GUI enhancements in HTML/Canvas/etc like I've done with Slate and the WebView overlay.
Users could literally drag controls out of live applications, plug them together into their own "stacks", configure and train and graphically customize them, and hook them together with other desktop apps, web apps and services!
For example, I'd like to make a direct manipulation pie menu editor that lets you just drag controls out of apps and drop them into your own pie menus, which you can inject into any application or use in your own GUIs. If you dragged a slider out of an app into the slice of a pie menu, it could rotate it around to the slice direction, so that the distance you moved from the menu center controlled the slider!
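As a rough sketch of the slider-in-a-pie-slice idea, here is how the pointer position could be turned into a slice index and a slider value. Everything here is an assumption made for illustration: the names `sliceIndex`, `sliderValue` and `SliderSlice` are hypothetical and are not part of aQuery or the jQuery Pie Menus, and it assumes equal-width slices with slice 0 centered on the positive x axis.

```typescript
// Hypothetical sketch: none of these names come from aQuery or jQuery Pie Menus.

interface Point {
  x: number;
  y: number;
}

interface SliderSlice {
  innerRadius: number; // distance from the menu center that maps to `min`
  outerRadius: number; // distance from the menu center that maps to `max`
  min: number;
  max: number;
}

// Which of `sliceCount` equal slices the pointer is in.
// Slice 0 is centered on the +x axis; indices increase counterclockwise.
function sliceIndex(center: Point, pointer: Point, sliceCount: number): number {
  const dx = pointer.x - center.x;
  const dy = center.y - pointer.y; // flip: screen y grows downward
  const angle = Math.atan2(dy, dx); // range -PI..PI
  const width = (2 * Math.PI) / sliceCount;
  // Shift by half a slice so slice 0 straddles angle 0, then wrap into 0..2*PI.
  const wrapped = (angle + width / 2 + 2 * Math.PI) % (2 * Math.PI);
  return Math.floor(wrapped / width) % sliceCount;
}

// Map the pointer's distance from the menu center onto the slider's range,
// clamping outside the inner and outer radius.
function sliderValue(center: Point, pointer: Point, slice: SliderSlice): number {
  const distance = Math.hypot(pointer.x - center.x, pointer.y - center.y);
  const t = (distance - slice.innerRadius) / (slice.outerRadius - slice.innerRadius);
  const clamped = Math.min(1, Math.max(0, t));
  return slice.min + clamped * (slice.max - slice.min);
}
```

With four slices, a pointer 70 pixels to the right of the center lands in slice 0 and, with radii of 20 and 120, sets a 0 to 100 slider to 50. Driving the real slider control would still need the accessibility API or synthesized input, which this sketch leaves out.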
While I'm at it, here's some stuff I'm writing about the jQuery Pie Menus. http://donhopkins.com/mediawiki/index.php/JQuery_Pie_Menus
----
Web Site: Morgan Dixon's Home Page. http://morgandixon.net/
Web Site: Prefab: The Pixel-Based Reverse Engineering Toolkit. https://web.archive.org/web/20130104165553/http://homes.cs.w...
Video: Prefab: What if We Could Modify Any Interface? Target aware pointing techniques, bubble cursor, sticky icons, adding advanced behaviors to existing interfaces, independent of the tools used to implement those interfaces, platform agnostic enhancements, same Prefab code works on Windows and Mac, and across remote desktops, widget state awareness, widget transition tracking, side views, parameter preview spectrums for multi-parameter space exploration, Prefab implements parameter spectrum preview interfaces for both unmodified Gimp and Photoshop: http://www.youtube.com/watch?v=lju6IIteg9Q
PDF: A General-Purpose Target-Aware Pointing Enhancement Using Pixel-Level Analysis of Graphical Interfaces. Morgan Dixon, James Fogarty, and Jacob O. Wobbrock. (2012). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '12. ACM, New York, NY, 3167-3176. 23%. https://web.archive.org/web/20150714010941/http://homes.cs.w...
Video: Content and Hierarchy in Prefab: What if anybody could modify any interface? Reverse engineering GUIs from their pixels, addresses hierarchy and content, identifying hierarchical tree structure, recognizing text, stencil based tutorials, adaptive GUI visualization, ephemeral adaptation technique for arbitrary desktop interfaces, dynamic interface language translation, UI customization, re-rendering widgets, Skype favorite widgets tab: http://www.youtube.com/watch?v=w4S5ZtnaUKE
PDF: Content and Hierarchy in Pixel-Based Methods for Reverse-Engineering Interface Structure. Morgan Dixon, Daniel Leventhal, and James Fogarty. (2011). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '11. ACM, New York, NY, 969-978. 26%. https://web.archive.org/web/20150714010931/http://homes.cs.w...
Video: Sliding Widgets, States, and Styles in Prefab. Adapting desktop interfaces for touch screen use, with sliding widgets, slow fine tuned pointing with magnification, simulating rollover to reveal tooltips: https://www.youtube.com/watch?v=8LMSYI4i7wk
Video: A General-Purpose Bubble Cursor. A general purpose target aware pointing enhancement, target editor: http://www.youtube.com/watch?v=46EopD_2K_4
PDF: Prefab: Implementing Advanced Behaviors Using Pixel-Based Reverse Engineering of Interface Structure. Morgan Dixon and James Fogarty. (2010). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '10. ACM, New York, NY, 1525-1534. 22% https://web.archive.org/web/20150714010936/http://homes.cs.w...
PDF: Prefab: What if Every GUI Were Open-Source? Morgan Dixon and James Fogarty. (2010). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '10. ACM, New York, NY, 851-854. https://web.archive.org/web/20141024012013/http://homes.cs.w...
Morgan Dixon's Research Statement: http://morgandixon.net/morgan-dixon-research-statement.pdf
Community-Driven Interface Tools
Today, most interfaces are designed by teams of people who are collocated and highly skilled. Moreover, any changes to an interface are implemented by the original developers and designers who own the source code. In contrast, I envision a future where distributed online communities rapidly construct and improve interfaces. Similar to the Wikipedia editing process, I hope to explore new interface design tools that fully democratize the design of interfaces. Wikipedia provides static content, and so people can collectively author articles using a very basic Wiki editor. However, community-driven interface tools will require a combination of sophisticated programming-by-demonstration techniques, crowdsourcing and social systems, interaction design, software engineering strategies, and interactive machine learning.
The way jQuery widgets can encapsulate native and browser specific widgets with a platform agnostic API, you could develop high level aQuery widgets like "video player" that knew how to control and adapt many different video player apps across different platforms (YouTube or Vimeo in the browser, VLC on the Windows or Mac desktop, QuickTime on Mac, Windows Media Player on Windows, etc). Then you can build much higher level apps out of widgets like that.
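One way such a "video player" widget might be structured is as a platform agnostic interface plus a registry of per-app adapters. This is only a sketch under stated assumptions: `VideoPlayer`, `VideoPlayerAdapter`, `VideoPlayerRegistry` and the `appId` strings are all hypothetical names, and the adapter below just records commands instead of driving a real application.

```typescript
// Hypothetical sketch: the interface and adapter names are illustrative only.

// What higher level apps would program against.
interface VideoPlayer {
  play(): void;
  pause(): void;
  isPlaying(): boolean;
}

// An adapter knows whether it can drive a given target and how to wrap it.
interface VideoPlayerAdapter {
  name: string;
  matches(appId: string): boolean;
  attach(appId: string): VideoPlayer;
}

class VideoPlayerRegistry {
  private adapters: VideoPlayerAdapter[] = [];

  register(adapter: VideoPlayerAdapter): void {
    this.adapters.push(adapter);
  }

  // Pick the first registered adapter that claims the target.
  attach(appId: string): VideoPlayer {
    const adapter = this.adapters.find((a) => a.matches(appId));
    if (!adapter) {
      throw new Error(`No video player adapter for ${appId}`);
    }
    return adapter.attach(appId);
  }
}

// Stand-in adapter that logs commands; a real one would use pixel
// recognition or the platform accessibility API to press the buttons.
function makeFakeAdapter(name: string, prefix: string, log: string[]): VideoPlayerAdapter {
  return {
    name,
    matches: (appId) => appId.startsWith(prefix),
    attach: (appId) => {
      let playing = false;
      return {
        play() {
          playing = true;
          log.push(`${name}: play ${appId}`);
        },
        pause() {
          playing = false;
          log.push(`${name}: pause ${appId}`);
        },
        isPlaying: () => playing,
      };
    },
  };
}
```

Code written against `VideoPlayer` never needs to know which application is underneath; supporting another player means registering one more adapter.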
Target aware pointing is one of many great techniques he shows can be layered on top of existing interfaces, without modifying them.
I'd like to integrate all those capabilities plus the native Accessibility API of each platform into a JavaScript engine, and write jQuery-like selectors for recognizing patterns of pixels and widgets, creating aQuery widgets that tracked input, drew overlays, implemented text to speech and voice control interfaces, etc.
His research statement sums up where it's leading: Imagine Wikipedia for sharing GUI mods!
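To make "selectors for recognizing patterns of pixels" a bit more concrete, here is a deliberately naive sketch: an exact-match template search over a bitmap, wrapped in a jQuery-flavoured `each` call. It is not Prefab's algorithm and not an existing aQuery API; `Bitmap`, `findPattern` and `PixelSelector` are hypothetical names, and pixels are assumed to be packed numbers in row-major order.

```typescript
// Hypothetical sketch: naive exact matching, not Prefab's actual technique.

interface Bitmap {
  width: number;
  height: number;
  pixels: number[]; // one packed 0xRRGGBB value per pixel, row-major
}

interface Match {
  x: number;
  y: number;
}

type MatchHandler = (match: Match) => void;

// True if `pattern` appears in `screen` with its top-left corner at (x, y).
function matchesAt(screen: Bitmap, pattern: Bitmap, x: number, y: number): boolean {
  for (let py = 0; py < pattern.height; py++) {
    for (let px = 0; px < pattern.width; px++) {
      const s = screen.pixels[(y + py) * screen.width + (x + px)];
      const p = pattern.pixels[py * pattern.width + px];
      if (s !== p) return false;
    }
  }
  return true;
}

// Scan every position where the pattern fits, top to bottom, left to right.
function findPattern(screen: Bitmap, pattern: Bitmap): Match[] {
  const matches: Match[] = [];
  for (let y = 0; y + pattern.height <= screen.height; y++) {
    for (let x = 0; x + pattern.width <= screen.width; x++) {
      if (matchesAt(screen, pattern, x, y)) matches.push({ x, y });
    }
  }
  return matches;
}

// Selector-style wrapper: bind a handler to every occurrence of a pattern.
class PixelSelector {
  private pattern: Bitmap;

  constructor(pattern: Bitmap) {
    this.pattern = pattern;
  }

  // Calls `handler` for each match and returns how many were found.
  each(screen: Bitmap, handler: MatchHandler): number {
    const found = findPattern(screen, this.pattern);
    found.forEach((match) => handler(match));
    return found.length;
  }
}
```

A handler passed to `each` is where an overlay could be drawn or an input event synthesized at the matched location. A practical version would need something far faster and more tolerant than brute-force exact comparison.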
Berkeley Systems (the flying toaster screen saver company) made one of the first screen readers for the Mac in 1989 and Windows in 1994. https://en.wikipedia.org/wiki/OutSpoken
Richard Potter, Ben Shneiderman and Ben Bederson wrote a paper called Pixel Data Access for End-User Programming and Graphical Macros, which references a lot of earlier work. https://www.cs.umd.edu/~ben/papers/Potter1999Pixel.pdf