PLAN-BASED ASSISTANCE IN THE WEBBROWSER FIREFOX

                             Thomas A. Bertz                                         Peter Reiss, M.A.
                      Chair for Artificial Intelligence                      Chair for Artificial Intelligence
                     University of Erlangen-Nuernberg                      University of Erlangen-Nuernberg
                      Haberstraße 2, D-91058 Erlangen                       Haberstraße 2, D-91058 Erlangen
           email: thomas.a.bertz@informatik.stud.uni-erlangen.de         email: reiss@informatik.uni-erlangen.de

ABSTRACT
We present a function library named pbifSystem that allows remote control of the web browser Firefox. Instead of controlling the browser through general PC input devices such as mouse or keyboard, a client calls functions of the library. The pbifSystem is meant to be a service layer that can be used by assistance systems. The service it offers consists of planning an action sequence that reaches the goal requested by the user (e.g. Increase to enlarge the font of the currently shown website) and then visually executing that sequence on its own. To demonstrate the pbifSystem's mode of operation without an assistance system, a toolbar (pbifBar) was built.

KEY WORDS
HMI, Planning, Application

1 Introduction

1.1 Motivation

In the early days of computer science, special training was necessary for handling computers. Available programs and commands had to be passed to a command interpreter (shell). Despite the abstraction from the underlying machine, users had and still have to know about a huge functional range. Functionality was hidden behind the interpreter or in the handbook. Yet another abstraction layer was created through graphical user interfaces (GUIs), whereby today's operating systems and applications can be operated much more intuitively. The idea is to enable novices, too, to solve some standard tasks with a personal computer autonomously, without having to study handbooks or attend courses of instruction. At least in principle, a simple mouse click today replaces complex commands on command shells.

Nevertheless, the growth in functionality of today's applications and the limited space on the desktop force developers to present the functions in a clearly laid out form. This becomes apparent in the menu structure of many applications, where functions that belong together are grouped in menus. Their content (menuitems) is not shown until the user requests it (e.g. by a mouse click). These menu structures provide a tidy, clear and still potentially complex desktop. The by now long-standing use of GUIs has, however, revealed new problems in control and generated new requirements. Mainly these include:

self-explanation Usage of menu-based applications has become more intuitive and easier (by simply browsing the menus for the function), but the complete functional range is still hidden behind the menus, much as it is hidden behind the interpreter in console-based applications. By integrating documentation into the program itself, separate handbooks can be omitted. Moreover, the contained information can be used for (interactive) demonstrations that could be presented to the user. This way an application could explain itself, its possible usage options and modes.

usability Because of the use of menu structures, many functions cannot be reached directly but only through sequences of actions (e.g. a sequence of mouse clicks). Assistance systems that have been made aware of the user's goals on the one hand (e.g. through speech dialog components) and that know how to read current system states on the other hand can help to re-establish ergonomic usability without requiring special software training of the user.

accessibility Novices, and with new programs or program versions expert users too, are hampered in learning new functions if these are hidden. Furthermore, control with mouse and keyboard may be suitable only to a limited extent for disabled people.

Assistance systems are a promising solution, because they adapt to the user and not the other way round.

1.2 Related Work

The UNIX Consultant (UC) [1] is an intelligent assistance system for the operating system UNIX. In a passive mode, unskilled users can pose questions about typical UNIX tasks to the system in natural language. UC answers with concrete instructions. In an active mode, UC can intervene and correct or optimize tasks initiated by the user. To do so, UC uses several modules, e.g. a speech dialog module to acquire the user's goals, a knowledge database that contains syntax and semantics of the UNIX command domain and the possible user pragmatics, and an inference and planning
module that reasons from internal and external system states to possible actions. This plan-based approach to finding solutions for user goals served as the model for the pbifSystem library developed in the work at hand.

Another prominent example of an assistance system is the assistant of Microsoft's Office package [2]. It uses Bayesian networks to estimate the user's goal or problem from a free-text request in natural language and presents an adequate solution. This already marks a distinction from the approach of UC and the pbifSystem. While the Office Assistant knows exactly the way (plan) to solve the problem or to reach the goal (it is static, hardcoded), the goal itself is only estimated with the help of Bayesian networks. In the work at hand the premises are the opposite: here the goal is known exactly, because a quite rudimentary language is being developed with which goals can be addressed directly. In return, the path to the goal is a priori unknown and is calculated only at runtime by a planning component.

1.3 Goals

The work at hand pursues three goals:

direct access Give direct access to functionality that is hidden behind the menu structure.

visualization The system should be able to visualize system activity so that it can also serve as an interactive control and teaching tool.

application in assistance systems The developed system shall constitute a library that can serve as a basis for assistance systems in the context of the web browser Firefox.

In conjunction with assistance systems, the pbifSystem should thereby meet the requirements stated above (self-explanation, usability and accessibility).

2 Tools

Firefox and the work at hand use a number of interfaces and programming languages. These are introduced here:

XUL stands for XML User Interface Language [3]. With XUL, GUIs can be described in a platform-independent manner. As in XML, structure and design are strictly separated. A rendering engine (Gecko in Firefox) is then responsible for transforming XUL tags like <button>, <menu> or <scrollbar> into graphical widgets like buttons, menus or scrollbars.

CSS stands for Cascading Style Sheets [4]. CSS describes the design (style) of XUL elements, which is likewise interpreted by Gecko. In the work at hand, CSS is mainly employed to visualize system activity to the user (e.g. the menuitem currently selected by the system).

JavaScript While XUL and CSS are pure description languages, JavaScript [5] can execute commands and actions, react to events [6] or do calculations. To give programs access to structure and style, many browsers (including Firefox) implement the DOM interface [7] in their JavaScript interpreters. The DOM interface defines how the structure of XUL documents is mapped to a tree. Trees are important in computer science because they are easy to navigate, read from and write to.

XPCOM For security reasons, JavaScript only has access to unprivileged commands and calls. XPCOM (Cross Platform Component Object Model) [8] was created to allow system calls from JavaScript and to make bridging between different programming languages possible. First, the interfaces of the components to be implemented are defined via XPIDL (XPCOM Interface Definition Language). This definition is platform independent. Second, the components are implemented in a suitable programming language (in the work at hand C/C++). The resulting components (XPCOM components) can then be accessed from JavaScript by the so-called XPConnect technique.

PDDL stands for Planning Domain Definition Language [9] [10]. In a LISP-like syntax one can represent a section of the world (situation) with objects, predicates and an initial situation. A goal situation describes the situation to be reached by a plan. So-called plan operators (actions) allow transitions from one situation to another. These enable a planner (here the Fast-Forward planner [11] is used) to find a possible path (a plan) from the initial situation to the goal situation.

3 The pbifSystem

Firefox is a fairly slim web browser. It contains just the code that is necessary to fulfill its tasks. In exchange, every user can install additional features through so-called extensions (XUL, CSS and JavaScript code) or components (XPCOM code). Within the scope of the work at hand, the pbifSystem [12] was developed, a Firefox extension that can be used as a library or basic service layer by assistance systems to control the web browser. An example of a cooperating assistance system is CONALD [13], developed at our chair. In the pbifSystem we first concentrated on the functionality of the main menu. To be able to demonstrate the working pbifSystem without an assistance system, we developed a toolbar named pbifBar (see Figure 1). It is meant to simulate function calls of an assistance system. In a listbox the user can currently choose between two assistance modes, namely pbifGuideMeByLabel and pbifGuideMeByUserGoal. The first addresses goals by the label name of the corresponding menuitem, while the second addresses them by self-defined goal names that have been communicated to the system in a teaching mode beforehand. The middle listbox holds the goal name. With a click on the Execute button the system starts to search for a valid plan and, in case of success, returns an assistance sequence that is visualized and executed.
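
The difference between the two assistance modes can be pictured as two different lookups that both end in the identifier of one menu element. The following JavaScript sketch is only an illustration under stated assumptions: the menu entries, their identifiers and the helper names resolveByLabel, teachGoal and resolveByUserGoal are hypothetical and are not taken from the pbifSystem source.

```javascript
// Hypothetical, simplified menu model: every entry has an id and a visible label.
const menuEntries = [
  { id: "m1", label: "File" },
  { id: "m2", label: "Text Size" },
  { id: "m3", label: "Increase" },
  { id: "m4", label: "Increase" }, // invented duplicate label
];

// Label mode: the goal is named by a menuitem label.
// If several entries share the label, the first occurrence is chosen.
function resolveByLabel(entries, label) {
  const hit = entries.find((entry) => entry.label === label);
  return hit ? hit.id : null;
}

// User-goal mode: the goal is named by a self-defined name
// that was stored earlier in a teaching step.
const taughtGoals = new Map();

function teachGoal(name, id) {
  taughtGoals.set(name, id);
}

function resolveByUserGoal(name) {
  return taughtGoals.has(name) ? taughtGoals.get(name) : null;
}

teachGoal("bigger text", "m3");
console.log(resolveByLabel(menuEntries, "Increase"));
console.log(resolveByUserGoal("bigger text"));
```

In both cases the result is a single goal element, which is all the planning step needs as input.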

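The search for a valid plan that is triggered by Execute can be illustrated with a very small planner. The sketch below is not the Fast-Forward planner used in this work; it is a plain breadth-first forward search, written in JavaScript, over a hypothetical five-element menu tree. It assumes two operators with the semantics of openMenu and clickMenuitem from the domain in Section 3.1.2; the data structures parentOf and kind and the function names are invented for illustration.

```javascript
// Hypothetical menu tree: child id -> parent id, and the kind of each node.
const parentOf = { file: "menubar", edit: "menubar", textSize: "edit",
                   increase: "textSize", undo: "edit" };
const kind = { file: "menu", edit: "menu", textSize: "menu",
               increase: "item", undo: "item" };

// List the actions that are applicable in a state (the set of open nodes).
function applicable(open) {
  const actions = [];
  for (const node of Object.keys(parentOf)) {
    if (!open.has(parentOf[node])) continue; // the parent must be open
    if (kind[node] === "menu" && !open.has(node)) {
      actions.push({ name: "openMenu", arg: node });
    } else if (kind[node] === "item") {
      actions.push({ name: "clickMenuitem", arg: node });
    }
  }
  return actions;
}

// Breadth-first forward search from the initial state (only the menubar
// is open) to a state in which the goal item has been clicked.
function findPlan(goal) {
  const start = new Set(["menubar"]);
  const queue = [{ open: start, plan: [] }];
  const seen = new Set([[...start].sort().join(",")]);
  while (queue.length > 0) {
    const { open, plan } = queue.shift();
    for (const act of applicable(open)) {
      const step = `${act.name}("${act.arg}")`;
      if (act.name === "clickMenuitem") {
        if (act.arg === goal) return [...plan, step];
        continue; // simplification: clicks on other items are never explored
      }
      const next = new Set(open).add(act.arg);
      const key = [...next].sort().join(",");
      if (!seen.has(key)) {
        seen.add(key);
        queue.push({ open: next, plan: [...plan, step] });
      }
    }
  }
  return null; // no plan exists for this goal
}

console.log(findPlan("increase"));
```

For the goal increase the search returns the three steps openMenu("edit"), openMenu("textSize") and clickMenuitem("increase"), which correspond to the click sequence a user would perform by hand.
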
Figure 1. pbifBar: Toolbar in Firefox

3.1 Call for Assistance

3.1.1 Internal Program Cycle

Suppose a user in a typical assistance situation wants to enlarge the font of the currently displayed webpage. Without an assistance system the user would have to execute the click sequence Edit | Text Size | Increase. Asking the pbifSystem to enlarge the font size would look like this: pbifGuideMeByLabel() as the mode and Increase as the goal name. If for some reason the menu holds several labels with the same name, the pbifSystem chooses the first occurrence (as of the writing of this paper; it is easy to imagine that a number of other meaningful semantics are possible and can be implemented).

Figure 2 shows the rough process of a typical call for assistance to the pbifSystem. In the initialization phase the pbifSystem traverses the complete DOM tree of the main menu and transforms it into an internal representation which can be translated to PDDL later on. This represents the Firefox menu world. If a user addresses a request for a goal to the pbifSystem, it is passed to Firefox's JavaScript interpreter and likewise translated to PDDL. From the two merged PDDL fragments a planning call is sent to the planner, which in case of success returns a valid plan for the planning problem. The plan is a click sequence of menu elements. This click sequence is returned to the JavaScript interpreter, which in turn executes and visualizes it. Visualization is realized through CSS by briefly changing the background color of the active elements (blinking).

Figure 2. Process of a call for assistance

3.1.2 Description of the Planning Domain

Listing 1 shows an extract of the Firefox menu domain PDDL code.

(define (domain domain0)
  (:types MENUBAR_T MENU_T MITEM_T)

  (:predicates
    (isPARENTof ?pParent ?pChild)
    (isOpen ?x)
    (clicked ?x))

  (:action openMenu
    :parameters (?pMenu - MENU_T)
    :vars (?lParent)
    :precondition
      (and
        (isPARENTof ?lParent ?pMenu)
        (isOpen ?lParent)
        (not (isOpen ?pMenu)))
    :effect (isOpen ?pMenu))

  (:action clickMenuitem
    :parameters (?pItem - MITEM_T)
    :vars (?lParent - MENU_T)
    :precondition
      (and
        (isPARENTof ?lParent ?pItem)
        (isOpen ?lParent)
        (not (clicked ?pItem)))
    :effect
      (and
        (clicked ?pItem)
        (not (isOpen ?lParent)))))

Listing 1. Firefox menu domain PDDL code

Predicates and plan operators (actions) have been hardcoded into the pbifSystem and are constant throughout runtime. They describe the semantics of Firefox's menu world. A menu can only be opened if its parent is open and the menu itself is closed. These are the preconditions to
execute the action. The postcondition (effect) of this plan operator is the opened menu (which is what we expected). The operator clickMenuitem is described similarly.

3.1.3 Description of the Planning Problem

In contrast, the objects and the values of the predicates are calculated at runtime in the initialization phase and can therefore change before each planning phase. As mentioned above, the complete DOM tree is traversed and translated to PDDL for this purpose. An extract is shown in Listing 2.

(define (problem problem0)
(:domain domain0)
(:objects
main-menubar - MBAR_T ; Root
pbifID_0 - MENU_T ; File
pbifID_1 - MITEM_T ; New Window
pbifID_2 - MITEM_T ; New Tab
pbifID_12 - MITEM_T ; Print...
pbifID_15 - MITEM_T ; Quit
pbifID_16 - MENU_T ; Edit
pbifID_17 - MITEM_T ; Undo
pbifID_18 - MITEM_T ; Redo
       ...
)

(:init
(isOpen main-menubar)
(isPARENTof main-menubar pbifID_0)
(isPARENTof pbifID_0 pbifID_1)
(isPARENTof pbifID_0 pbifID_2)
      ...
(isPARENTof pbifID_0 pbifID_15)
(isPARENTof main-menubar pbifID_16)
(isPARENTof pbifID_16 pbifID_17)
      ...
)

(:goal (clicked pbifID_38))
)

Listing 2. Firefox menu problem PDDL code

Each object in the Firefox menu world is associated with an identification number named pbifID so that it can be referenced unambiguously. It was not possible to use the label names instead because they are not necessarily unique. Moreover, they may contain symbols (Unicode) that are not defined in the PDDL syntax. The pbifID is generated automatically while reading the DOM tree. Afterwards, the parent-child relationships of menus, submenus and menuitems are mapped into the planning world through the isPARENTof predicate. Translating the goal request to PDDL forms the last step. Together this constitutes a complete planning problem in a planning domain, which is sent to the planner. The list (sequence) returned consists of plan operators whose names were modeled on the corresponding JavaScript function names. Because of this, the plan returned by the planner can be executed directly by the JavaScript method eval().

4 Results

In this work we showed, using the web browser Firefox as an example, that a plan-based approach is helpful and promising in the development and application of assistance systems. UC for the console-based UNIX and the pbifSystem for the GUI-oriented Firefox show that the concept can be transferred to other programs. Preconditions such as read access to and control access over the program must be met. Firefox in particular allows for extension and generalization to dialog windows or webpage content. For that purpose, more numerous and more extensive plan operators need to be written. The pbifSystem offers, by way of example, two "intelligent" operation modes, a sensitive one (pbifGuideMeByLabel()) and an adaptive one (pbifGuideMeByUserGoal()). The pbifSystem enables assistance systems to control the web browser Firefox and allows service layers of "higher intelligence" to be built on top of it.

References

[1] R. Wilensky, D. N. Chin, M. Luria, J. H. Martin, J. Mayfield, and D. Wu, “The Berkeley UNIX Consultant project,” Computational Linguistics, vol. 14, no. 3, pp. 35–84, 1988.
[2] D. Heckerman and E. Horvitz, “Inferring informational goals from free-text queries: A Bayesian approach,” 1998.
[3] unknown author, XULPlanet, “XUL reference,” 1999. http://www.xulplanet.com/references/, 2006-04-19 12:27.
[4] H. W. Lie and B. Bos, “Cascading style sheets, level 1,” 1999. http://www.w3.org/TR/REC-CSS1, 2006-04-19 12:23.
[5] D. Flanagan, JavaScript. Cambridge: O’Reilly, 1998.
[6] T. Pixley, “Document object model (DOM) level 2 events specification,” 2000. http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113, 2006-04-11 10:04.
[7] L. Wood et al., “Document object model (DOM) level 1 specification (second edition),” 2000. http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/, 2006-04-11 10:03.
[8] D. Turner and I. Oeschger, “Creating XPCOM components,” 2003. http://www.mozilla.org/projects/xpcom/book/, 2006-04-19 12:54.
[9] D. McDermott, M. Ghallab, and A. Howe, “PDDL - the planning domain definition language,” tech. rep., AIPS’98 Planning Competition Committee, October 1998. Version 1.2.
[10] A. Gerevini et al., “Plan constraints and preferences in PDDL3,” tech. rep., Department of Electronics for Automation, University of Brescia, Italy, August 2005.
[11] J. Hoffmann, “FF: The fast-forward planning system,” The AI Magazine, 2001.
[12] T. Bertz, “Planbasierte Benutzerführung im Webbrowser Firefox,” Studienarbeit, Universität Erlangen-Nürnberg, May 2006.
[13] M. Klarner, Hybride, pragmatisch eingebettete Realisierung mittels Bottom-Up-Generierung in einem natürlichsprachlichen Dialogsystem. PhD thesis, Universität Erlangen-Nürnberg, 2005.