Visualisation of the Java Runtime Environment

Page created by Cecil Brooks
 
CONTINUE READING
Visualisation of the Java Runtime Environment
Visualisation of the Java Runtime
                 Environment

                                      by

                    Martin Alfred Henschke, BSc.
         School of Computer Science, Engineering and Mathematics,
                     Faculty of Science and Engineering
            Supervisors: Mr. Graham Bignell, A/Prof Paul Calder

                            November 19th, 2010

 Submitted in partial fulfillment of the requirements of the Honours degree of
   Bachelor of Science in Computer Science at Flinders University of South
                                    Australia

Adelaide, South Australia, 2010
Contents

Declaration of Originality                                                          vi

Abstract                                                                           vii

Acknowledgements                                                                   viii

Introduction                                                                         1

Literacy Review                                                                      4
   Algorithm Animators . . . . . . . . . . . . . . . . . . . . . . . . . . . .       4
   Program Visualizations . . . . . . . . . . . . . . . . . . . . . . . . . . .      4
   Visual Programming Environments . . . . . . . . . . . . . . . . . . . .           6
   Effectiveness of Visualizations . . . . . . . . . . . . . . . . . . . . . . .     7
   Java Virtual Machine Architecture . . . . . . . . . . . . . . . . . . . .         9

Problem Statement                                                                   10

1 Project Summary                                                                   12
   1.1   Project Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     12
   1.2   JVM as an information source . . . . . . . . . . . . . . . . . . . .       12
   1.3   Visualisation as an information medium . . . . . . . . . . . . . .         13
   1.4   Dynamic Visualisation . . . . . . . . . . . . . . . . . . . . . . . .      14
   1.5   Experimental Procedure . . . . . . . . . . . . . . . . . . . . . . .       15

2 Java Virtual Machine Features                                                     16
   2.1   Object Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    16
   2.2   Frame Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    18

                                         ii
3 Visualisation Outline                                                             19
  3.1   Object Reference View . . . . . . . . . . . . . . . . . . . . . . . .       19
        3.1.1   The Main Node . . . . . . . . . . . . . . . . . . . . . . . .       20
        3.1.2   Color Coding . . . . . . . . . . . . . . . . . . . . . . . . .      21
        3.1.3   Querying Nodes . . . . . . . . . . . . . . . . . . . . . . . .      21
        3.1.4   Navigation of Tree . . . . . . . . . . . . . . . . . . . . . .      22
        3.1.5   Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    24
        3.1.6   Multiple References . . . . . . . . . . . . . . . . . . . . . .     25
        3.1.7   Looping references . . . . . . . . . . . . . . . . . . . . . .      26
  3.2   Frame Stack Visualisation . . . . . . . . . . . . . . . . . . . . . .       27
        3.2.1   Local Variables . . . . . . . . . . . . . . . . . . . . . . . .     27
        3.2.2   Source Code View      . . . . . . . . . . . . . . . . . . . . . .   28
        3.2.3   Time Controls . . . . . . . . . . . . . . . . . . . . . . . . .     29

4 Implementation                                                                    31
  4.1   Initializing TraneViz . . . . . . . . . . . . . . . . . . . . . . . . .     32
  4.2   Data Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      32
        4.2.1   Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    32
        4.2.2   Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   33
  4.3   Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   33
        4.3.1   Time Controls . . . . . . . . . . . . . . . . . . . . . . . . .     33
  4.4   Agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     34

5 Experimental Design                                                               35
  5.1   Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    35
  5.2   Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    36
  5.3   Materials   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   36
        5.3.1   Workshop Content . . . . . . . . . . . . . . . . . . . . . .        37
        5.3.2   Testing Materials . . . . . . . . . . . . . . . . . . . . . . .     38
  5.4   Expected Outcome . . . . . . . . . . . . . . . . . . . . . . . . . .        39

                                         iii
6 Academic Survey                                                                   40
  6.1   Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      40
  6.2   Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   40
  6.3   Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    42

7 Conclusion                                                                        43
  7.1   Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     43
        7.1.1   Low-level Representation of JRE . . . . . . . . . . . . . .         43
        7.1.2   Implementation of Dynamic Visualisation         . . . . . . . . .   44
  7.2   Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     44
        7.2.1   Limitations of User Interface . . . . . . . . . . . . . . . . .     44
        7.2.2   Additions to Visualisation Features . . . . . . . . . . . . .       46
        7.2.3   Experiment Execution . . . . . . . . . . . . . . . . . . . .        46
  7.3   Research Implications . . . . . . . . . . . . . . . . . . . . . . . . .     47

Bibliography                                                                        48

A Academic Survey Questionnaire                                                     51

                                         iv
List of Figures

 3.1   An application visualized in the Object Reference View . . . . . .          20
 3.2   An Object Reference Visualisation accompanied by the Color Legend 22
 3.3   A selected node displaying its instance variables in the Object Ref-
       erence Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . .    23
 3.4   An Object Reference Visualisation with Side Bar displaying depth            24
 3.5   Arrays in the Object Reference view . . . . . . . . . . . . . . . .         25
 3.6   An example of multiple referencing in the Object Reference view.
       The node highlighted in green points to the same object as the
       selected node. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    26
 3.7   A program’s execution trace shown in the Frame Stack Visualisation 28
 3.8   A selected frame and it’s local variables in the Frame Stack Visu-
       alisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   29
 3.9   The active frame highlighting the line of code it points to in the
       Frame Stack Visualisation . . . . . . . . . . . . . . . . . . . . . .       30

 4.1   The System Architecture for the TraneViz . . . . . . . . . . . . .          31

                                        v
Declaration of Originality

I certify that this thesis does not incorporate without acknowledgement any ma-
terial previously submitted for a degree or diploma in any university; and that to
the best of my knowledge and belief it does not contain any material previously
published or written by another person except where due reference is made in the
text.

Signed                              Dated

Martin Alfred Henschke

                                        vi
Abstract

Visualisation used as a demonstration tool, in the teaching of object-oriented con-
cepts in computer science, exist both as standalone tools used in conjunction with
running applications and integrated development environments. Although many
of these tools have merit in teaching high-level conceptual information, issues
can arise when students make the transition from the use of these tools to more
advanced or low-level programming tools and techniques, as information about
program functionality is typically abstracted out of the visualisations presented
in these tools.
    This project attempts to bring the benefits of higher level comprehension and
engagement inherent with the use of visual teaching tools while providing a low-
level perspective of the functionality of a program’s runtime environment. This
application, TraneViz, is designed for use in developing programs written in the
Java programming language. The visualisation provides information about the
JRE’s two most critical features, the frame stack and the object heap, while
providing an additional layer of abstraction to demonstrate other features such
as the reference relationship between variables and objects, variable assignments
and the flow of execution.
    The visualisation developed is aimed at aiding students’ understanding of
more complex object-oriented concepts at a first or second year level. The appli-
cation has been implemented to support a single-threaded, small scale Java pro-
gram. A thorough plan for testing and assessment resources has been developed
to determine if the application is a viable alternative to traditional non-visual
methods of teaching, but the experiment had to be simplified to use academic
staff and students in place of beginner programming students due to a lack of
volunteers.
    This project demonstrates that the Java Runtime Environment can be rep-
resented visually, and whether or not the visualisation has positive learning out-
comes for students. On a broader level, it provides a study on the use of visu-
alisation in a teaching context, and whether this approach has merits in other
scientific applications.

                                        vii
Acknowledgements

I wish to acknowledge my supervisors for their assistance and encouragement
in the preparation of this thesis: My primary supervisor, Mr. Graham Bignell
aided in the early development of the project plan and research, and my assistant
supervisor A/Prof Paul Calder in its implementation. I would also like to thank
Mr. Brett Wilkinson for acting as a mediator between myself and the students I
had intended to use in testing my visualisation, and Dr. Carl Mooney in aiding
in the development of an experimental plan.
    Finally, I would like to thank all the staff and students at Flinders University
who participated in the academic survey, who have accomodated my research
and provided feedback on possible applications and improvements to the visuali-
sation.

                                        viii
Introduction

As object-oriented languages have become more popular as the first language
taught to programming students, visualisations designed to demonstrate object-
oriented features of these languages have become increasingly popular as teaching
tools. Applications such as BlueJ and Alice are now standards in many schools
around the world as tools used to first teach students about the basic principles of
object orientation. These visualisations can vary broadly in scope; some provide
a simple demonstration of a particular aspect of a language while others provide a
fully integrated development environment. Many also provide additional levels of
abstraction on top of the subject matter, to hide more advanced and complicated
language features not considered relevant to teaching the broader principles of
OO programming.
    These forms of visualisation support a high-level ‘objects first’ approach to
teaching programming [Cooper et al., 2003], where the broad features of the data
organisation structure of OO languages are taught to students initially, followed
by programming structure and basic syntactic features. More advanced language
features are typically not taught to students until they have gained a firm grasp
of the paradigm; once this stage has been reached students make the transition
to learning more traditional procedural programming techniques and how these
relate to the object-oriented paradigm. Visualisations usually support only the
first stage of this method of teaching, providing an abstract representation of
objects and data structures while hiding some of the more complicated features of
the language usually reserved for later years of study. Once this point is reached,
students are advised to program in a different environment, either through the
use of a conventional IDE or by the command line.
    While this method of teaching does provide a strong understanding of the
OO paradigm, it can have negative consequences in later years of study. The
visualisation tools used to initially teach these early aspects of object orientation
have been observed to be used by students well beyond the scope of their use as
teaching tools by the course providers. As these tools typically hide information
about the more complicated language features and lack many of the more sophis-
ticated features of an IDE, the use of these tools outside the context of teaching
high-level OO features can restrict student learning outcomes [Kölling, 2008]. An

                                         1
over-reliance of these tools can make the transition to a command line environ-
ment or to an IDE challenging for students. Tools that hide information such as
access to the main method or syntactic features can severely hamper a student’s
ability to pick up the language with all of its features included in a different en-
vironment. These problems can be linked back to the high level of abstraction
typically seen in visualisation tools, which contribute to hiding information about
how a programming language functions.
    This project attempts to provide a solution to this issue by providing a vi-
sualisation with a significantly different scope to help teach students about the
more complicated low-level features of the Java programming language, while re-
taining a simple interface that is facilitative to student programming at a first
or second year level. This approach is based on the observation that many is-
sues with these visualisations lie in the abstraction of low-level information. The
visualisation produced by this project will act as a compliment in teaching specif-
ically about these features. The application developed as the main aspect of this
project acts as an abstract visualisation of the Java Runtime Environment, or
JRE. Many of the core features of the Java programming language can be di-
rectly observed and are governed by the JRE. The visualisation developed takes
advantage of this fact and uses this information to demonstrate some of the key
features of how the Java programming language functions.
    The visualisation is divided into two main sections. The first section provides
a visual representation of the object heap, which stores each object that has
been constructed in the process of the application. Each object contains state
information and pointers to other objects stored in the heap. The visualisation
takes advantage of the pointer structure of referencing in the heap, and provides
a hierarchical view of how objects reference one another. This visualisation is
populated by a series of nodes organized in a tree-like structure, with each node
representing an object variable pointing to another object. A connection between
nodes indicates the parent node holds a reference to the child node. The resultant
structure allows students to view the structure of pointers and references in their
Java application, and allow them to develop an understanding about variable
access and pointers without requiring a strong understanding of the material
prior to observing the visualisation.
    The second section of the visualisation is a direct representation of the frame
stack stored in the main thread of execution. The visualisation does not support
multi-threading, so only the main thread’s stack is observable. In the stack, the
JRE stores a series of frames that point to each method currently in the process
of being executed. Frames are added and removed as methods are called and
completed. At each point in an application’s execution, students are able to
observe each frame in the stack, and the value of local variables stored in each

                                         2
of the frames. The way the frame stack is presented is very similar to how many
IDE and debuggers present execution history when reporting exceptions, and so
should be familiar and easy to understand by most students. It also allows them
to comprehend what each aspect of their application does, how it affects the flow
of execution and object states, as well as demonstrating in detail many other
features of the language typically taken for granted, such as the initialization
of field variables outside of the constructor and the passing of parameters to
methods. Finally it gives a detailed view of local variables and differentiates
them from field variables, helping students understand how the scope of methods
affects the accessibility of data.
    To determine the effectiveness of the visualisation as a teaching tool for first
and second year computer science students, an experimental plan has been de-
veloped, aimed at specifically determining if there are any measurable benefits
to learning with this tool, and if so which method of interaction with the tool
provides the greatest benefits. Students are tested through a two hour tutorial,
where they are taught information considered to be relevant to the learning out-
comes of the visualisation. The experiment consists of three major test groups-
the first has access to the visualisation as a demonstration tool, and are only
able to watch programs as they execute. The second group has full access to the
visualisation as a development tool and are able to make modifications to the
application and observe the results of their changes. The third and final group
act as the control, and are taught the same material but do not have access to
the visualisation. Each experimental group is administered two written tests,
one before the tutorial and one after, which they must complete to determine
any changes in their understanding of the material.
    Due to a lack of participants for the experiment, a shorter survey involv-
ing honours students PhD students and academic staff was taken to determine
whether or not the visualisation was percieved as a viable teaching tool. The sur-
vey involved a brief 15 minute demonstration of the visualisation’s features and
capabilities, followed by a short written survey including a quantitative analysis
of the visualisation’s accessibility and usefulness, as well as several short-answer
questions regarding the strongest and weakest aspects of the visualisation, and
its possible applications.

                                         3
Literacy Review

Algorithm Animators

The use of visualization in teaching Computer Science has its roots in algorithm
animators, which provide a visual representation of the function of algorithms.
Traditional algorithm animators are standalone computer applications that take
in user specified data and produce an animation to accompany the output. Cur-
rently, algorithm animations are mostly used in teaching broad computer science
concepts, including searching, sorting and recursive algorithms.
    One such system is XTango [Stasko, 1992], an algorithm animator that al-
lows users to develop custom animations for algorithms written in C. The system
provides editing tools that allow the representation of variables and data struc-
tures as shapes, and also provide a means of manipulating their position, color
or size as their state changes. Fluid animations are used to represent changes
in state; discrete ‘frames’ of each step in an algorithm make precisely detecting
changes difficult, where an animation draws the users attention to a specific part
of the visualization and allows them to observe how it changes. As the system
allows custom animations to be made, it can serve as teaching tool for lecturers
to demonstrate how particular algorithms operate.

Program Visualizations

Visualization tools became more popular with the mainstream acceptance of the
object-oriented paradigm and the replacement of Pascal with Java as the most
commonly used teaching language in introductory Computer Science courses.
Object-oriented languages lend themselves well to visualizations, as objects can
be represented as individual entities that are created and modified over the life of
a process. The use of visualizations has become increasingly popular in teaching
approaches that have a focus on the data structure and the abstract concepts of
the language, due to their clear and effective representation of the OO paradigm,
which can be difficult to understand or efficiently utilize at first [Dershem and

                                         4
Vanderhyde, 1998, Machanick, 2007]. The varieties of visualization available to
aid in learning object-oriented languages can vary from a windowed application
accompanying a running program to a programming interface with visual com-
ponents, but each have the fundamental goal of providing a visual abstraction of
the problem space to make programming more accessible to students.
    The simplest of visualizations provide a view of one aspect of a running Java
application. Jacot is a tool developed to demonstrate the concurrent capabilities
of Java by providing a visualization of thread operation over the life of a program
[Leroux et al., 2003]. There are two parts to the visualization: a sequence diagram
which shows the execution history of each thread, and a thread state diagram
which shows the state of each created thread at a given point in time. Jacot
uses UML sequence diagrams for representing thread activity, because of their
ability to represent time and the interaction between objects through method
invocations. Thread state diagrams are represented as UML statechart diagrams,
as they can accurately depict each of the various states of a thread over its lifetime.
Jacot can be used to visualize concurrency issues such as starvation and deadlock,
but its close adherence to UML has been criticized as not being able to represent
some critical behaviors of threads in concurrent applications such as notification,
suspension and sleeping. This problem has been addressed by other more recent
visualizations, such as the lock-causality graph [Kim et al., 2009] which creates
a new graphing notation to better represent concurrency, and extensions to the
UML notation to supplement these shortcomings [Artho et al., 2007].
   JAVAVIS is another system that provides a separate visual interface, but its
focus is on object attributes and method calls rather than concurrency [Oechsle
and Schmitt, 2002]. JAVAVIS uses an object diagram to visualize a frame on
the JVM stack, providing information about the frame’s possessing object, local
variables and method arguments. A sequence diagram is also used, with the same
notation as a UML sequence diagram, to represent method calls between objects.
As the visualization continues, method calls are made and recorded as events on
the sequence diagram. During program execution different classes can be selected
and visualized. The goal of the tool is to provide a gestalt perspective on the
application’s function at runtime, to solidify concepts such as flow of execution.
The Jeliot family of visualizations have a similar goal and system to JAVAVIS,
but with a stronger focus on visualization as a learning tool [Moreno et al., 2004].
The Jeliot visualizations also show interactions between objects and variables,
but provide several different visualizations which students may select depending
on which is most relevant to what they want to understand about their program.

                                          5
Visual Programming Environments

In some Computer Science courses, programming is a secondary focus to higher
level concepts such as object orientation and algorithm function. This approach,
based on Bloom’s taxonomy, is based on recent research suggesting that pro-
gramming is not considered to be the lowest level of cognitive skills in computer
science [Machanick, 2007, Hitchner et al., 2001]. Particularly with lower level
languages such as C++, where students have access to system resources through
dynamic memory allocation or pointers, learning solely about the language’s func-
tion without proper understanding of the overarching structure and capabilities
can lead to poorly designed programs and systems. Learning to program at too
early a phase can be considered to have a negative impact on a student’s learning
outcomes in the field of Computer Science [Hitchner et al., 2001]. In response to
this concern, there exist several visualization tools that provide a programming
interface designed to take pressure off of the complex syntax and semantics of
Java and focus more heavily on the concepts that govern it.
    One of the earliest and most commonly used of these visualization interfaces
is BlueJ [Kölling et al., 2003], a visual programming interface that provides a
static visualization of the class structure of the application. BlueJ is designed
to provide an object-oriented program development environment in place of a
more conventional, procedurally structured environment. The system is designed
to remedy three major issues observed with programming environments that in-
terfere with understanding the OO paradigm: they are traditionally not object
oriented themselves, overly complex, and too heavily focused on user interface
development. BlueJ extends UML notation to produce a class diagram, which
displays class association, inheritance and polymorphism. Modifying the source
code of a given class requires finding its place in the hierarchy, providing an in-
formative display of the object-oriented concepts that govern a Java program as
the students writes their code [Van Haaster and Hagan, 2004]. Programmers are
also able to create instances of objects external to the main method of the appli-
cation, and individual object methods can be called by the programmer, adding
tangibility to data structures. A survey of the use of BlueJ as a teaching tool
found students performed better and were more comfortable with the concepts of
the OO paradigm when using the tool than those that used a text editor. BlueJ’s
heavy focus on object orientation over industry standard programming practice
have also been acknowledged by Kölling [2008] as making the transition from
BlueJ to an IDE or to command-line programming a difficult one however, which
can lead to an over reliance on the tools at inappropriate levels and restricted
learning outcomes.
   Several visualization tools that have been developed provide an interface to

                                        6
supplement or replace manual coding, foregoing the need to teach syntax in favor
of higher level concepts. JPie is a teaching tool that provides a programming
interface by dragging and dropping tags containing code instructions to form
statements [Goldman, 2004]. Calling methods or modifying variables is done by
finding the correct methods and variables from a list of tags and dragging and
dropping them into the method body. Programming is done entirely through
the visualisation, and requires no text modification; language features such as
setting method access and modifiers are done through the manipulation of the
interface rather than writing code. Another tool with a similar interface for
program development is Alice, where statements are written in pseudocode rather
than in Java syntax [Cooper et al., 2003]. Instead of a traditional programming
tool, Alice’s programming interface allows users to create animated 3D scenes.
Objects are placed in a 3D environment and can be manipulated using drag and
drop programming commands. Alice also supports the teaching of many more
complicated concepts such as recursion [Dann et al., 2001]- ordinarily a very
complicated concept that is difficult to visualize Alice’s tools allows students to
easily produce recursive methods and receive visual feedback on how they operate.

Effectiveness of Visualizations

The use of visualizations as a pedagogical tool is a dividing factor within the com-
puter science community. Advocates of visualization argue visual representations
provide an unique perspective on programming that aids student comprehension,
and provides a more interesting and engaging way to learn. Visualizations are
also argued to be more natural to programmers; Petre and Blackwell [1999] de-
termined in a study of industry experts that visual thinking is a large part of
the programming process, and programmers will often use or fabricate visual
metaphors to help them in understanding a problem. Despite this, visualization
tools have yet to become popular in mainstream teaching. A meta-study on the
effectiveness of algorithm visualizations revealed that despite most studies con-
cluding that visualisations are effective teaching tools, they remain unpopular
with universities as they not effective enough to justify the amount of time neces-
sary to become familiar with them [Hundhausen et al., 2002]. One of the primary
advantages of using a visualization tool is intuition- the abstract representation
should lead to an intuitive understanding of the subject. However visualizations
are not always a perfect mapping of the problem domain, and their interpretation
is a learned skill [Petre, 1995]. Despite the importance of interaction with these
tools, surprisingly little research has gone into the impact of interface design and
interactivity. Echoed in another study, it has been found that student interfac-
ing with these tools can be just as important as the visualization produced by

                                         7
the tools themselves. Lawrence et al. [1994] determined that students that ac-
tively engaged with algorithm animation tools by providing their own data and
manipulating the visualization had higher learning outcomes than students that
were provided with ready-made data. It was also determined that these bene-
fits seemed to have the greatest impact on deeper levels of comprehension, in
understanding the fundamental aspects of the algorithm itself.
    Abstract visualizations also have several inherent dangers. Petre [1995] dis-
cusses the heavy reliance images have on relative component positioning and
size (dubbed ‘secondary notation’) in conveying meaning. Secondary notation is
present in all visualizations, and implies the relevance of aspects of the image and
relationships between components. Well constructed visualizations use secondary
notation to improve clarity and simplicity of the diagram, but conversely images
that fail to recognize these subtle distinctions can imply relationships and levels
of relevance where they do not exist, making them misleading. The audience for
which the visualization is intended is also important- using a visual metaphor in
place of a concrete concept implies some level of information will be lost- what
information is lost and how the metaphor makes up for this shortcoming will of-
ten determine the purpose of the visualization itself. Visual metaphors designed
to explain very high level concepts, as many of the program visualizations are,
will often hide a large number of lower level concepts. These visualizations are
useful for understanding the concepts the image accounts for, but will be of little
use to programmers who need access to the hidden information. In the same
vein visualizations with many concepts covered in the visual metaphor may be
useful to skilled programmers, but will be too confusing for novices. The level
of abstraction, i.e. how “close” the visual metaphor is to the underlying concept
is also a major influencing factor in how useful the visualization is as a learning
tool. Scheiter et al. [2008] performed a study comparing realistic and abstract
biological visualizations of cell division, and the benefits each approach had to
students. It was determined that the most realistic representations (in this case
camera feeds from microscopes) had a negative impact on learning outcomes, and
students performed best when using abstract representations. Similarly, abstract
representations can shield students from certain lower level concepts, but when
it is necessary to learn these concepts it can make the bridging process more
difficult than it would have been otherwise [Kölling, 2008].
    In addition to secondary notation, many visualizations use animation of com-
ponents to draw the user’s attention to relevant aspects of the image. As most
algorithm animators and program visualizations are showing changes to the state
of a system, animation can be used to make specific changes more obvious and
their effects on the rest of the system clearer. Despite this, the use of anima-
tion in software visualization tools is also subject to some debate. In a survey

                                         8
on commercial programmers, it was found that animation and 3D visualizations
are the least important aspects of a visualization tool for this group [Bassil and
Keller, 2001]. Furthermore, analysis of the Jeliot programming tools found that
animations could often obstruct information and make the tool more difficult and
frustrating to use [Moreno et al., 2004]. A meta-survey conducted by Tversky
et al. [2002] on the use of animation found that they can be dangerous in some
applications as it can be misleading or distracting, and benefits are only seen in
systems where the animation conveys some additional meaning to that already
provided in the visualizations; it has a null or negative effect if used to reiterate
information that is already present or for cosmetic reasons.

Java Virtual Machine Architecture

As specified in Lindholm and Yellin [1994], the Java Virtual Machine architecture
is made up of a series of components that interpret class files and execute instruc-
tions. There are five major components that make up the JVM- the PC register,
which points to the currently executing method for each thread, a stack unique
to each JVM thread that stores frames and traces thread execution throughout
the application, the heap accessible to all threads that is used by references to
directly access data objects, the method area which contains loaded class meth-
ods analogous to compiled code and used for thread executions and the constants
pool which keeps a list of all final variables for each object. These five major com-
ponents are used by the main thread of execution and all instantiated threads to
access instructions and update the program’s state. The JVM state is accessible
through the Java Platform Debugging Architecture (JDPA), a multi-tiered inter-
face that has direct access to JVM execution through the use of agents [jdk, 2010].
It is made up of three levels: the Java Virtual Machine Tools Interface (JVMTI),
a native-code interface that operates through the JDI to access VM state infor-
mation directly, the Java Debug Wire Protocol (JDWP) which specifies the way
in which debuggers interact with the JVM and the Java Debug Interface (JDI),
a Java-side interface that accesses state information from the VM. The JVMTI
is typically used for accessing information specifically about the functionality of
the VM and is recommended only for debugging very low-level program features
or the VM itself, while the JDWP/JDI are designed to produce debugging tools
for use with other Java applications.

                                         9
Problem Statement

The goal of this project is to provide a solution to some of the major issues
that exist with the currently available visualisations. At present a large host of
visualisations exist that have beneficial results on students’ understanding of the
OO paradigm. However many visualisations used as teaching tools have been
acknowledged to also have negative effects on student learning. These relate to
the method in which the visualisations are presented and the information they
provide to students.
    Most high-level visualisations demand a level of abstraction to present their
information clearly and cohesively without confusing the users with technical de-
tail. Although this does help to make the visualisations more approachable to
beginning students, the method in which this information is hidden makes it more
difficult to comprehend when it is necessary to access at a later date. This infor-
mation typically includes lower-level features of programming languages- in the
context of Java this can include the method by which variables reference objects,
the storage of field and local variables, the main method and its relationship with
the rest of the program, the method of execution and in some systems even basic
syntactic language features. If this information is not sufficiently covered by other
learning sources, it can lead to a poor understanding of language functionality
and the inability to write functional applications on a larger or more intricate
scale.
    Visualisations have also been noted to make the transition to different devel-
opment environments more difficult. Integrated development environments such
as Alice or BlueJ successfully make programming easier by removing complicated
features of language such as main methods, structuring of class files or detailed
syntax, but as a consequence can make the transition to another environment
that does not automatically handle these features more difficult. Students have
been reported to continue using inappropriate development tools at much later
stages in their education, which can act as a detriment to their ability to compre-
hend presented material- without an understanding of the basic mechanics of the
language, proper understanding and effective utilization of advanced language
features is impossible.

                                         10
This project will approach this problem by providing a similar programming
tool with a focus on information provided by the Java Runtime Environment, and
determine whether or not a similar programming tool can be developed to be used
alongside these visualisations that allow students to better understand the lower-
level features of Java. The project must also determine if such an approach will
have positive learning outcomes for students, and if it is successful in solving or
alleviating the issues associated with other visualisation tools.

                                        11
Chapter 1

Project Summary

1.1     Project Aims

The project has the overarching goal of providing a solution to the problem
statement by developing a piece of software that would be used to help students
understand some of the lower level features of the Java programming language.
The software would be primarily aimed at a beginner programming audience as
a demonstration tool rather than as a software-development tool, but provide
sufficient flexibility for experimentation. The application is also intended to be
used with running Java applications- the tool should be able to take a program
as an input and use it to generate the relevant information.
    The scope of this project will encompass the development of a software tool
that is capable of producing information representing the low-level functionality
for any given Java program. It will also encompass the testing the efficacy of this
software with beginner-level programmer students to determine if the software
improves their understanding of the Java language and the way Java programs
work.

1.2     JVM as an information source

The source of information for the software will be the Java Virtual Machine,
the platform on which all Java applications run. When the project software is
executed, it will start a new instance of the JVM running the demonstration
program, and access all information about the state of the program from the
JVM. The JVM retains all runtime information about the program it is currently
running, including the state and value of each of the objects that have been
constructed and the state of each currently executing thread.

                                        12
Information regarding the state of JVM can be accessed using the Java Plat-
form Debugging Architecture, or JDPA. The JDPA defines a protocol for access-
ing each of the various attributes the JVM stores as well as some basic abilities to
control its execution such as requesting sleep and notify events on active threads.
The JDPA is divided into two main levels- a Java-side level that accesses infor-
mation about the program currently running on the JVM, and a C-side agent
layer that is able to interface with the JVM directly and access all information
regarding it’s state and execution. Support for these interfaces is typical in most
virtual machine implementations, including Oracle’s HotSpot VM which is the
standard used for most Java installations.
    The JVM was chosen as the source of information for the application as it
provides a large amount of runtime information about a Java application while
retaining a low-level focus. As an agent constructed to operate on the JVM is
independent of either the platform or the application, it allows the software to be
highly flexible in its application. The low-level perspective of the JVM combined
with the wealth of information it provides also allows for abstraction to be used
to remove information considered irrelevant, and provide a stronger focus on the
aspects of the application considered most crucial to the understanding of the
underlying features of the Java language.

1.3      Visualisation as an information medium

One of the main tasks of the application will be the implementation of a visu-
alisation that will use the information extracted from the JVM. The goal of the
visualisation is to provide an easy-to-understand visual perspective of the running
application, and giving the user the ability to control what information they wish
to display through a visual interface.
    Visualisation was chosen as the main medium for presented information to
exploit the beneficial learning features associated with the use of visuals, while
retaining the depth necessary to present the information from the JVM. Visu-
alisations have been shown to improve student learning outcomes when used to
represent concepts that lend themselves well to a visual representation, and those
that make effective use of diagrammatic secondary notation can improve compre-
hension of material [Petre, 1995]. Well designed visualisations can also provide
a highly detailed view of a particular area of a problem, or a broad, overarching
gestalt perspective. This allows students to either view a problem as a whole or
find a particular aspect of problem in the visualisation and extract more detailed
information about it from that perspective. In the problem space of a program ex-
ecuting in runtime, with a potentially large number of objects or runtime features

                                        13
to visualize, the ability to have both a broad and specific perspective available is
highly desirable.
    Visualisations are also considered to be more interesting and comprehensible
than a flat textual representation. With the large amount of runtime information
available from the JVM, simply viewing this information from a textual perspec-
tive alone could make the location of specific information difficult and be less ben-
eficial as a teaching tool. Organizing information visually makes relevant infor-
mation not only easier to locate, but more interesting and captivating for the user.
A visual representation of running Java applications is relatively common among
both visualisations and other teaching materials, especially through the use of
UML notation to demonstrate either program execution [Leroux et al., 2003]
or class structure [Van Haaster and Hagan, 2004, Riley, 2006]. The widespread
use of these applications demonstrates the effectiveness of using visualisations to
represent object-orientation visually.
    The visualisation should be able to represent the object-oriented features of
an application and provide sufficient depth to allow comprehension of some of the
more complicated features of the JVM. Due to the JVM’s high level of complex-
ity, a direct visual representation would not be beneficial as a learning tool for
beginning programmers- a level of abstraction over the architecture will ensure
information provided is relevant and easy to comprehend while maintaining a
good level of depth A detailed user interface should also allow the scope of the
visualisation to change from an overarching perspective of the entire application
to a more precise, detailed view of individual components. The visual interface
should also be simple, tidy and consistent with the features it is intended to
demonstrate.

1.4      Dynamic Visualisation

This application will be implementing a dynamic visualisation rather than a static
one; this means that as the state of the application changes, the visualisation
will update to demonstrate these changes in real time. A dynamic visualisation
was chosen to be implemented for this project as it provides significantly better
feedback on a program than a static one would in a similar context. This also adds
a level of complexity to the application, as it demands the virtual machine be run
in real time alongside the visualisation rather than separately, and measures must
be taken to ensure it is properly controlled by the user without compromising the
integrity of the application.
   For the application to run dynamically, the VM will be running in parallel
alongside the application. This requires the visualisation to be able to start and

                                         14
stop the VM at particular intervals. This approach was chosen over the alternate
possibility of running the application before the visualisation and generating a
profile with all information about the VM stored within it. Generating a profile
has the advantage allowing users to move forward and backwards in execution,
but runtime errors would be detected before the profile completes whereas in a
real-time running VM, a runtime error is detected once the visualisation reaches
it in execution. This allows students to witness exactly where it occurred, and
monitor the state of the application before it occurred. Another benefit of live
execution is other possible errors that cannot be detected easily, such as infinite
loops. A profile cannot recreate an infinite loop as effectively as running the
program in real time, as the error can be observed as the program is executing.

1.5      Experimental Procedure

Once the visualisation has been completed, to determine whether or not it will
be effective as a teaching tool an experimental procedure will be developed. The
experiment will use volunteers from the target audience, namely first or second
year programming students, and test the effect the visualisation has on their
understanding of programs written in Java, as well as the Java language itself. It
should include presenting low-level material about the Java language to students,
with one of the student groups having access to the visualisation and a control
group that does not, to determine if the first group has a better understanding
by the end of the experiment.
    Execution of the experiment is an optional aspect of the project as time con-
straints as well as the complexity of the software necessary for the experiment to
go ahead are considered to make any formal experimentation unfeasible. How-
ever, plans for such an experiment as a possible segment for future work on this
project make the development of an experimental plan a desirable component.
The ability to formally test the capabilities and value of the application will pro-
vide insight into whether the visualisation is an effective learning tool, as well
as whether or not a low-level approach such as the one employed in this project
has merit in practical teaching applications. These conclusions are necessary to
verify the validity of the approach, and the project in general.

                                        15
Chapter 2

Java Virtual Machine Features

As the visualisation takes its information directly from the Java Virtual Machine,
and is intended to demonstrate runtime behavior of a program, the structure of
the visualisation very closely follows the model used for the JVM itself. The two
main features of the JVM, the object heap and the frame stack are used as the
basis for the two main views of the visualisation. Abstraction has been used in
devising the visualisation to hide information considered irrelevant to the purpose
of the application and to ensure the information presented is as clear as possible.

2.1      Object Heap

The main memory storage aspect of the JVM is the object heap, where all declared
objects are stored and referenced. The heap has no organization or structure,
but each object is dynamically allocated the amount of memory necessary for
encapsulation, and their memory locations are used by other objects and methods
to access their individual components.
    A typical Java application stores a series of objects over the duration of its
execution. Due to the tight integration of the OO paradigm into Java, all ob-
jects exist as references to another object with the exception of the main class,
which instead stores objects as local variables of its method or static fields of its
containing class. All objects in Java are stored in memory and are inaccessible
for access by other objects unless they hold a reference, either a field or local
variable, that points to the object’s location.
    Field and local variables function in Java in a very similar fashion to pointers
in C. Each one holds a memory location for the object it refers to, and using
the ’.’ operand in conjunction with a variable allows access to the object the
reference points to.

                                         16
In this form of memory structure, each object is intrinsically linked to one
another through a hierarchical structure. Each object holds a series of variables
that point to other objects. At the beginning of every Java application, in the
main method a series of objects are defined and held as references, which in turn
create more objects and store them as references. Each object created in a Java
application can be traced back to the main method, and the initialisation of the
main method’s containing class.
    This hierarchical structure forms the basis for the first component of the visual
model. The object heap, although itself without an implemented organization
structure, is organized by the way in which objects create and reference one
another inherent with Java’s mechanism of execution. At the bottom of this
structure are the variables accessible to the main method, with every other Java
object that has been created and stored able to trace its containing objects’
references back to the main node. This creates a tree-like structure, which can
be represented visually.
    There are some features of the Java language that must be taken as special
cases to be accounted for in this model. Primitive values in Java such as integers
and bytes are not stored in the same way objects are- references to primitives
contain the value of the primitive itself, rather than a pointer to an address in
memory where the primitive is stored. As a consequence, primitives do not appear
as nodes on the tree. Static variables among classes also break the paradigm as
a static variable does not belong to any one object. It is considered that each
object instead has a reference to the same variable, and as a consequence the
static variable of a class appears as a reference to each object of the class, with
each reference pointing to the same object.
   In addition to this, there are several other possible mechanisms that can be
taken in the Java language that break this model. These are each handled indi-
vidually and discussed in the visualisation section
    This component of the model is designed to help students better understand
how Java handles object references and its pointer-based structure. Making a
modification to a object through a single reference will influence all references
that point to that object, which can cause unexpected results in a program if not
properly catered for- this model demonstrates when this will occur, and to what
effect. It also allows students to observe the structure of their applications from
a gestalt perspective and better understand what objects exist and their state at
different points in execution.

                                         17
2.2      Frame Stack

The second feature considered relevant to this application is the frame stack
stored in the JVM. For every active thread in an application, the JVM stores
a frame stack that measures its current position in execution. The frame stack
consists of a series of frames that demonstrate each method a Java program is in
the process of calling. Every time a new method is called, a frame is added to the
stack that stores information about the exact location of the execution and the
state of local variables within that method. Frames are removed when a method
completes its execution.
    The visualisation developed has a relatively literal interpretation of the frame
stack, and visualises it as a series of ‘blocks’ that are stacked on top of one
another. Each block contains a list of information found in the frame, namely
the containing object’s type, the method, line number and a list of all local
variables contained in the frame. The frame stack visualisation is designed to
be completely compatible with the heap visualisation, and any local variables
that are displayed in the frame stack should be viewable on the object tree. As
with the frame stack in the JVM, frames are added and removed as methods are
entered and exited. As it is not intended for this visualisation to support multiple
threads of execution, only one frame stack, that belongs to the main stream of
execution, will be present.
    The frame stack allows students to understand a number of features regarding
the method in which Java executes code. The linear progression can be observed
as the frame stack updates from one line to the next, and points where it devi-
ates from a linear execution, such as to initialize field variables that have been
declared and assigned outside of the constructor, will be viewable. It also makes
an important distinction between field variables and local variables- where field
variables are always accessible and have technically no scope of accessibility out-
side of Java’s imposed member access structure, local variables are only accessible
within the method that they are declared in, and reference to them is removed
as soon as the method terminates. It will also demonstrate scoping within meth-
ods, with local variables declared within conditional statements or loops being
dereferenced once the frame has moved outside of that area of execution. Finally,
as it allows the inspection of the value of variables, it may help with some very
rudimentary debugging processes, such as detecting infinite loops.

                                        18
Chapter 3

Visualisation Outline

The JVM visualisation developed for this project, TraneViz, is capable of compil-
ing and launching a Java program’s source code input and producing a dynamic
runtime visualisation of the state of the application. The dynamic visualisation
generated by TraneViz uses information taken directly from the runtime state of
the VM used by the embedded application. The application demonstrates many
of the Java programming languages features and how they apply to individual
applications.

3.1     Object Reference View

The object reference view takes its information directly from the JVM heap, but
the subject of the visualisation is the relationship between objects and variables.
The diagram forms a hierarchical tree-like structure, consistent of a series of
nodes. Each node represents an instance variable held by an object, and each
node is labeled with the variable name. The root of each tree is in the center of
the diagram, with each sub node forming a circular shape around it. A connection
between nodes indicates that the child node, the node farthest from the center of
the circle is an instance variable owned by the object the parent node is pointing
to.
    Figure 3.1 demonstrates a simple example of a series of objects connected
by references. The center of the tree, bookcase, is a variable pointing to an
object that holds three books- horrorbook, mysterybook and fairytale. Each
are connected in a circular pattern outside of bookcase. In addition to this,
mysterybook also holds a reference to a bookmark object, connected to the outside
of the mysterybook node.
   To ensure the simplicity of the diagram, only objects created from classes
that were written directly by the programmer are displayed on the tree diagram.

                                        19
Figure 3.1. An application visualized in the Object Reference View

All other objects, including those defined in the Java SE6 API are hidden from
view. This is because many pre-defined or imported classes can have very large
definitions with lots of variables, that can overwhelm the diagram and make
navigation significantly more challenging. Hiding these objects from the tree
diagram ensures all information is relevant and easy to comprehend.

3.1.1     The Main Node

At the root of every single tree is the main node. When any Java applica-
tion starts, typically the first operation that is performed is executing the main
method, and the termination of any application occurs when this method is ex-
ited by the main thread of execution. However, as the main method is static, all
Java applications technically begin with no object being created or referenced;
with the visual model specified, there is no technical ‘root’ to the application’s
reference structure. To solve this issue, at the root of every tree is what is referred
to as the main node.
    The main node is not technically an object, but represents the stem of all
references from the beginning of the application. These references include every
local variable that is declared and initialized in the main method, and any static
field variables owned by the main method’s containing class. These two sources
combined represent the beginning point for every object that is created at the
start of an application. On the tree diagram, the main node is always given the

                                          20
label ‘main’, reflecting its method name. An example of a main node can be seen
in Figure 3.2.

3.1.2     Color Coding

To increase organization of the display, a color legend is employed. Each class
defined by the user is given a unique color code that determines the color of their
nodes on the tree. Each color is weighted depending on the number of references
an object holds- the more objects, the greater the saturation of the color. A legend
displaying the code from colors to objects is displayed in the top left corner of
the screen- mousing over any of the objects in the legend will highlight each node
of that class with a gradient that demonstrates the variation in saturation- the
lowest saturation of the gradient is used for objects containing no variables, while
the highest saturation shows objects containing over ten variables.
    Figure 3.2 shows a object reference visualisation next to its color code. The
application is a representation of a zoo, filled with animals of different types. Each
animal is assigned a unique color code depending on its class, which is then used
to represent them on the tree diagram. In this image, none of the animals contain
any object references, and have a very low color saturation, while the center node,
having six references has a relatively high color saturation. Saturation can be
compared when observing the gradient on the tree- the lowest saturation makes
up the left portion of the gradient, and the highest saturation makes up the right.
In this image, the class ‘Chimpanzee’ has been moused over, which has caused
the variable ‘ham’ to be highlighted- this helps in identifying individual classes,
particularly if some are of a similar color such as dog and rhinoceros.

3.1.3     Querying Nodes

The diagram’s interface allows the selection of variables to be performed with a
single mouse click, to query that particular. When a node is selected, a red box
surrounds the node, and its contents are displayed in the variable view window
on the right side of the screen. The variable view window contains a table of all
instance variables currently stored within that node- this information includes
the name and type of each of the variables, and where appropriate, the value of
the variable.
    Instances of classes not defined by the user are not displayed on the tree
diagram, but they are displayed on variable view, so that they can be observed.
This is also true for all primitive values, which are not technically objects but still
require a display. Selecting any node will display not only the object references
it possesses, but all primitive values as well. The variable view also displays any

                                          21
You can also read