Consider this simple C++ program:
#include <string>

std::string str = "hello world";

int main ()
{
    return 0;
}
Compile it and start it under gdb. Look what happens when you print the string:
(gdb) print str
$1 = {static npos = 4294967295,
_M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x804a014 "hello world"}}
Crazy! And worse, if you’ve done any debugging of a program using libstdc++, you’ll know this is one of the better cases — various clever implementation techniques in the library will send you scrambling to the gcc source tree, just to figure out how to print the contents of some container. At least with string, you eventually got to see the contents.
Here’s how that looks in python-gdb:
(gdb) print str
$1 = hello world
Aside from the missing quotes (oops on me), you can see this is much nicer. And, if you really want to see the raw bits, you can use “print /r“.
So, how do we do this? Python, of course! More concretely, you register a pretty-printer class under a regular expression that matches a type name; any time gdb tries to print a value whose type name matches that regular expression, your printer is used instead.
Here’s a quick implementation of the std::string printer (the real implementation is more complicated because it handles wide strings, and encodings — but those details would obscure more than they reveal):
class StdStringPrinter:
    def __init__(self, val):
        self.val = val

    def to_string(self):
        return self.val['_M_dataplus']['_M_p'].string()

gdb.pretty_printers['^std::basic_string<char,.*>$'] = StdStringPrinter
The printer itself is easy to follow — an initializer that takes a value as an argument, and stores it for later; and a to_string method that returns the appropriate bit of the object.
This example also shows registration. We associate a regular expression, matching the full type name, with the constructor.
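To make the registration model concrete, here is a minimal stand-alone sketch of regex-keyed dispatch. It does not use gdb at all: FakeValue, find_printer, and the plain dictionary are stand-ins invented for illustration, and gdb's real lookup may differ in detail.

```python
import re

class FakeValue:
    """Hypothetical stand-in for a gdb value: a type name plus named fields."""
    def __init__(self, type_name, fields):
        self.type_name = type_name
        self.fields = fields

    def __getitem__(self, name):
        return self.fields[name]

class StdStringPrinter:
    def __init__(self, val):
        self.val = val

    def to_string(self):
        # The fake fields already hold a Python string, so no .string() call.
        return self.val['_M_dataplus']['_M_p']

# Plain dictionary playing the role of gdb.pretty_printers.
pretty_printers = {'^std::basic_string<char,.*>$': StdStringPrinter}

def find_printer(printers, val):
    """Return a printer instance for the first pattern matching the type name."""
    for pattern, constructor in printers.items():
        if re.match(pattern, val.type_name):
            return constructor(val)
    return None

s = FakeValue(
    'std::basic_string<char,std::char_traits<char>,std::allocator<char> >',
    {'_M_dataplus': {'_M_p': 'hello world'}})
printer = find_printer(pretty_printers, s)
print(printer.to_string())
```

A value whose type name matches no pattern gets no printer, which corresponds to gdb falling back to its raw display.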
One thing to note here is that the pretty-printer knows the details of the implementation of the class. This means that, in the long term, printers must be maintained alongside the applications and libraries they work with. (Right now, the libstdc++ printers are in archer. But, that will change.)
Also, you can see how useful this will be with the auto-loading feature. If your program uses libstdc++ — or uses a library that uses libstdc++ — the helpful pretty-printers will automatically be loaded, and by default you will see the contents of containers, not their implementation details.
See how we registered the printer in gdb.pretty_printers? It turns out that this is second-best — it is nice for a demo or a quick hack, but in production code we want something more robust.
Why? In the near future, gdb will be able to debug multiple processes at once. In that case, you might have different processes using different versions of the same library. But, since printers are registered by type name, and since different versions of the same library probably use the same type names, you need another way to differentiate printers.
Naturally, we’ve implemented this. Each gdb.Objfile — the Python wrapper class for gdb’s internal objfile structure (which we briefly discussed in an earlier post) — has its own pretty_printers dictionary. When the “-gdb.py” file is auto-loaded, gdb makes sure to set the “current objfile”, which you can retrieve with “gdb.get_current_objfile“. Pulling it all together, your auto-loaded code could look something like:
import gdb.libstdcxx.v6.printers
gdb.libstdcxx.v6.printers.register_libstdcxx_printers(gdb.get_current_objfile())
Where the latter is defined as:
def register_libstdcxx_printers(objfile):
    objfile.pretty_printers['^std::basic_string<char,.*>$'] = StdStringPrinter
When printing a value, gdb first searches the pretty_printers dictionaries associated with the program’s objfiles — and when gdb has multiple inferiors, it will restrict its search to the current one, which is exactly what you want. A program using libstdc++.so.6 will print using the v6 printers, and (presumably) a program using libstdc++.so.7 will use the v7 printers.
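The per-objfile search can be pictured with a small stand-alone model. Everything here is hypothetical scaffolding (FakeObjfile, lookup_printer, and the strings standing in for printer classes); it only illustrates why scoping the dictionaries to objfiles keeps two library versions apart.

```python
import re

class FakeObjfile:
    """Hypothetical stand-in for gdb.Objfile with its own printer dictionary."""
    def __init__(self, filename):
        self.filename = filename
        self.pretty_printers = {}

def lookup_printer(objfiles, type_name):
    """Search only the given objfiles, i.e. those of the current inferior."""
    for objfile in objfiles:
        for pattern, printer in objfile.pretty_printers.items():
            if re.match(pattern, type_name):
                return printer
    return None

v6 = FakeObjfile('libstdc++.so.6')
v6.pretty_printers['^std::basic_string<char,.*>$'] = 'v6 string printer'
v7 = FakeObjfile('libstdc++.so.7')
v7.pretty_printers['^std::basic_string<char,.*>$'] = 'v7 string printer'

# Two inferiors, each linked against a different library version.
inferior_a = [FakeObjfile('app-a'), v6]
inferior_b = [FakeObjfile('app-b'), v7]

name = 'std::basic_string<char, std::char_traits<char>, std::allocator<char> >'
print(lookup_printer(inferior_a, name))
print(lookup_printer(inferior_b, name))
```

The same type name resolves to a different printer depending on which inferior's objfiles are searched.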
As I mentioned in the previous post, we don’t currently have a good solution for statically-linked executables. That is, we don’t have an automatic way to pick up the correct printers. You can always write a custom auto-load file that imports the right library printers. I think at the very least we’ll publish some guidelines for naming printer packages and registration functions, so that this could be automated by an IDE.
The above is just the simplest form of a pretty-printer. We also have special support for pretty-printing containers. We’ll learn about that, and about using pretty-printers with the MI interface, next time.
Apparently Fedora 10’s eclipse-ecj doesn’t have gcj-compiled libraries any more. Never mind:
mkdir /usr/lib/gcj/eclipse-ecj
aot-compile -c "-O3" /usr/lib/eclipse/dropins/jdt/plugins /usr/lib/gcj/eclipse-ecj
rebuild-gcj-db
Also, whilst I’m messing with my system, I’ve always had to do the following for ppc64 builds to work:
mkdir -p /usr/lib/jvm/java-gcj/jre/lib/ppc64/server
ln -s /usr/lib64/gcj-4.3.2/libjvm.so /usr/lib/jvm/java-gcj/jre/lib/ppc64/server
I never figured out how anyone else manages without this. Maybe nobody else is trying to build two platforms on the one box.
One of the more obscure language changes included back in JDK 5 was the addition of hexadecimal floating-point literals to the platform. As the name implies, hexadecimal floating-point literals allow literals of the float and double types to be written primarily in base 16 rather than base 10. The underlying primitive types use binary floating-point so a base 16 literal avoids various decimal ↔ binary rounding issues when there is a need to specify a floating-point value with a particular representation.
The conversion rule for decimal strings into binary floating-point values is that the binary floating-point value nearest the exact decimal value must be returned. When converting from binary to decimal, the rule is more subtle: the shortest string that allows recovery of the same binary value in the same format is to be used. While these rules are sensible, surprises are possible from the differing bases used for storage and display. For example, the numerical value 1/10 is not exactly representable in binary; it is a binary repeating fraction just as 1/3 is a repeating fraction in decimal. Consequently, the numerical values of 0.1f and 0.1d are not the same; the exact numerical value of the comparatively low precision float literal 0.1f is
0.100000001490116119384765625
and the shortest string that will convert to this value as a double is
0.10000000149011612.
This in turn differs from the exact numerical value of the higher precision double literal 0.1d,
0.1000000000000000055511151231257827021181583404541015625. Therefore, based on decimal input, it is not always clear what particular binary numerical value will result.
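These three decimal expansions can be checked outside Java. The sketch below uses Python, whose float is an IEEE 754 double; round-tripping through the struct module's 'f' format is used here as a stand-in for Java's float type.

```python
import struct
from decimal import Decimal

# Round 0.1 to the nearest binary32 value, then widen it back to a double.
float_tenth = struct.unpack('<f', struct.pack('<f', 0.1))[0]
double_tenth = 0.1

print(Decimal(float_tenth))    # exact decimal expansion of 0.1f
print(repr(float_tenth))       # shortest string that round-trips as a double
print(Decimal(double_tenth))   # exact decimal expansion of 0.1d
```

Decimal(x) converts a binary floating-point value exactly, so the printed digits are the true stored values rather than rounded displays.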
Since floating-point arithmetic is almost always approximate, dealing with some rounding error on input and output is usually benign. However, in some cases it is important to exactly specify a particular floating-point value. For example, the Java libraries include constants for the largest finite double value, numerically equal to $(2 - 2^{-52}) \cdot 2^{1023}$, and the smallest nonzero value, numerically equal to $2^{-1074}$. In such cases there is only one right answer and these particular limits are derived from the binary representation details of the corresponding IEEE 754 double format. Just based on those binary limits, it is not immediately obvious how to construct a minimal length decimal string literal that will convert to the desired values.
Another way to create floating-point values is to use a bitwise conversion method, such as doubleToLongBits and longBitsToDouble. However, even for numerical experts this interface is inhumane since all the gory bit-level encoding details of IEEE 754 are exposed and values created in this fashion are not regarded as constants. Therefore, for some use cases it is helpful to have a textual representation of floating-point values that is simultaneously human readable, clearly unambiguous, and tied to the binary representation in the floating-point format. Hexadecimal floating-point literals are intended to have these three properties, even if the readability is only in comparison to the alternatives!
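For comparison, here is roughly what the bitwise route looks like, sketched in Python with the struct module. The helper names long_bits_to_double and double_to_long_bits are chosen here only to echo the Java method names; they are not a real library API.

```python
import struct
import sys

def long_bits_to_double(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

def double_to_long_bits(value):
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack('>Q', struct.pack('>d', value))[0]

# Largest finite double: maximum normal exponent field, all-ones fraction.
largest = long_bits_to_double(0x7FEFFFFFFFFFFFFF)
# Smallest nonzero double: zero exponent field, fraction of 1 (a subnormal).
smallest = long_bits_to_double(0x0000000000000001)

print(largest == sys.float_info.max)
print(smallest)
print(hex(double_to_long_bits(3.0)))
```

The magic constants work, but nothing about 0x7FEFFFFFFFFFFFFF tells a reader which numerical value is meant without decoding the fields by hand.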
Hexadecimal floating-point literals originated in C99 and were later included in the recent revision of the IEEE 754 floating-point standard. The grammar for these literals in Java is given in JLSv3 §3.10.2:
HexFloatingPointLiteral:
    HexSignificand BinaryExponent FloatTypeSuffix(opt)
This readily maps to the sign, significand, and exponent fields defining a finite floating-point value: sign 0x significand p exponent. This syntax allows the literal
0x1.8p1
to be used to represent the value 3: $1.8_{16} \times 2^{1} = 1.5_{10} \times 2 = 3$.
More usefully, the maximum value of
$(2 - 2^{-52}) \cdot 2^{1023}$ can be written as
0x1.fffffffffffffp1023
and the minimum value of
$2^{-1074}$ can be written as
0x1.0P-1074 or 0x0.0000000000001P-1022, which are clearly mappable to the various fields of the floating-point representation while being much more scrutable than a raw bit encoding.
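The literals above can be sanity-checked without a Java compiler. Python's float.fromhex accepts a notation similar to the C99 and Java syntax (without the type suffix), so the following sketch, which is an illustration and not part of the JDK work, parses each one and compares it with the value it is claimed to denote.

```python
import math
import sys

three = float.fromhex('0x1.8p1')
largest = float.fromhex('0x1.fffffffffffffp1023')
smallest_a = float.fromhex('0x1.0p-1074')
smallest_b = float.fromhex('0x0.0000000000001p-1022')

print(three)
print(largest == sys.float_info.max)
print(smallest_a == smallest_b == math.ldexp(1.0, -1074))

# Going the other way: the hexadecimal rendering of a double.
print((3.0).hex())
```

The two spellings of the minimum value differ only in where the binary point is placed; the second mirrors the subnormal encoding, with the minimum exponent and a single low-order fraction bit.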
Retroactively reviewing the possible steps needed to add hexadecimal floating-point literals to the language:
Update the Java Language Specification: As a purely syntactic change, only a single section of the JLS had to be updated to accommodate hexadecimal floating-point literals.
Implement the language change in a compiler: Just the lexer in javac had to be modified to recognize the new syntax; javac used new platform library methods to do the actual numeric conversion.
Add any essential library support: While not strictly necessary, the usefulness of the literal syntax is increased by also recognizing the syntax in Double.parseDouble and similar methods and outputting the syntax with Double.toHexString; analogous support was added in corresponding Float methods. In addition the new-in-JDK 5 Formatter "printf" facility included the %a format for hexadecimal floating-point.
Write tests: Regression tests (under test/java/lang/Double in the JDK workspace/repository) were included as part of the library support (4826774).
Update the Java Virtual Machine Specification: No JVMS changes were needed for this feature.
Update the JVM and other tools that consume classfiles: As a Java source language change, classfile-consuming tools were not affected.
Update the Java Native Interface (JNI): Likewise, new literal syntax was orthogonal to calling native methods.
Update the reflective APIs: Some of the reflective APIs in the platform came after hexadecimal floating-point literals were added; however, only an API modeling the syntax of the language, such as the tree API, might need to be updated for this kind of change.
Update serialization support: New literal syntax has no impact on serialization.
Update the javadoc output: One possible change to javadoc output would have been supplementing the existing entries for floating-point fields in the constant fields values page with hexadecimal output; however, that change was not done.
In terms of language changes, adding hexadecimal floating-point literals is about as simple as a language change can be: only straightforward and localized changes were needed to the JLS and compiler, and the library support was clearly separated. Hexadecimal floating-point literals aren't applicable to that many programs, but when they can be used, they have extremely high utility in allowing the source code to clearly reflect the precise numerical intentions of the author.
Since the OpenJDK source was released, there have been various discussions about aspects of the build architecture, including how dependencies on third-party libraries should be managed. For example, on a Linux system should the JDK use its own libzip or the libzip that comes with the distribution? I think the appropriate answers to these and related questions hinge on whether the final deliverable from the OpenJDK project is viewed as the source code itself or a binary built from the source.
Traditionally, before OpenJDK, the end results of Sun's JDK project that most people used were the JDK and JRE binaries. These binaries were meant to be universal in the sense of being usable on a given processor family across a version range of operating systems. For example, there was a single Windows x86 binary for use on, say, Windows NT, Windows XP, etc., a single Solaris SPARC binary for use across Solaris 8 through Solaris 10, and effectively a single Linux x86 binary for use across different Linux distributions.
This "single binary" model drives decisions about what platform to produce the binary on, generally an older release, and what assumptions can be made about available system resources, generally rather weak ones. With a single binary deliverable, since fewer environment resources can be relied on, making the JDK build self-contained is necessary for it to be reliable in a wide variety of environments. With this delivery architecture, there is some justification for, for example, including a copy of a library like libzip in the JDK build rather than relying on a system library, even though there are increased maintenance costs.
However, when an OpenJDK/IcedTea binary is being built on a particular Linux distribution for use only on that distribution, the constraints are different. If the build is being done by the OS vendor, the vendor controls the OS contents and knows whether or not system libraries like libzip are reliable and kept up to date. Since stronger assumptions can be made about the host environment, weaker conditions need to be fulfilled by the JDK source tree. For an OS vendor, relying on a single copy of native libraries for the OS and the JDK is preferable to building (and maintaining) multiple copies.
Going forward, I'd expect the JDK build to evolve to better accommodate options to use host platform resources. Perhaps module systems in the future can help manage such dependencies more transparently.
The JCK tests probe the conformance of a Java platform implementation to a specification. For example, JCK 6b is the current test suite to determine conformance to the Java SE 6 spec. Official claims of conformance require not only passing the complete set of relevant tests, but also meeting the other requirements spelled out in the JCK user's guide.
Regarding the JCK, conformance is measured with respect to a binary rather than to a source base directly, which is sensible since a Java platform implementation will typically rely on and be affected by the properties of the host environment, including the OS, the C compiler, and the processor.
Previously Red Hat announced that using OpenJDK sources augmented with IcedTea patches, an OpenJDK binary on Fedora 9 passed the JCK and met the other conformance requirements.
Amongst other changes, after incorporating community-developed patches (notably 6748251 in b13) and following the OpenJDK 6 build instructions, inside Sun a binary resulting from building the unmodified OpenJDK 6 b13 sources on Red Hat Enterprise AS 2.1 with gcc 2.95 (the official Linux build platform for Sun's 6 update releases) passed all the JCK 6b tests when run on Fedora Core 8 x86. That binary also meets all the other JCK requirements.
OpenJDK 6 binaries built from b13 (and later) sources on and for different host environments are now more likely to share those favorable conformance properties, but testing would be necessary to verify conformance status and to make any formal statements.
Running with the usual jtreg flags, -a -ignore:quiet always and -s for the langtools area, the basic regression test results on Linux for OpenJDK 6 build 14 are:
HotSpot, 3 tests passed.
Langtools, 1,351 tests passed.
JDK, 3,077 tests pass, 26 tests fail, 3 tests have errors.
In this build, we upgraded from a HotSpot 10 base to HotSpot 11. HotSpot 11 is also being used in the 6u10 release. The set of included tests for the HotSpot versions differ:
0: b13-hotspot/summary.txt  pass: 5
1: b14-hotspot/summary.txt  pass: 3

0      1      Test
pass   ---    compiler/6547163/Test.java
pass   ---    compiler/6563987/Test.java
pass   ---    compiler/6571539/Test.java
pass   ---    compiler/6595044/Main.java
---    pass   compiler/6663621/IVTest.java
---    pass   compiler/6724218/Test.java

6 differences
In langtools all the tests continue to pass:
0: ./b13-langtools/summary.txt  pass: 1,351
1: ./b14-langtools/summary.txt  pass: 1,351

No differences
And in jdk, a few new tests were added in b14 and the existing tests have generally consistent results:
0: b13-jdk/summary.txt  pass: 3,072; fail: 23; error: 3
1: b14-jdk/summary.txt  pass: 3,077; fail: 26; error: 3

0      1      Test
---    fail   com/sun/org/apache/xml/internal/ws/server/Test.java
pass   fail   java/awt/TextArea/UsingWithMouse/SelectionAutoscrollTest.html
---    pass   java/awt/image/ConvolveOp/EdgeNoOpCrash.java
---    pass   javax/management/monitor/DerivedGaugeMonitorTest.java
pass   fail   javax/swing/JColorChooser/Test6541987.java
---    pass   javax/swing/JFileChooser/6484091/bug6484091.java
---    pass   sun/management/jmxremote/LocalRMIServerSocketFactoryTest.java
---    pass   sun/nio/cs/TestUTF8.java
---    pass   sun/security/ssl/com/sun/net/ssl/internal/ssl/SSLEngineImpl/EmptyExtensionData.java
---    pass   tools/pack200/MemoryAllocatorTest.java

10 differences
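The comparison tables above list only the tests whose status differs between two runs. As a rough sketch of that idea (not the actual tool used to produce them), the following Python compares two hypothetical dictionaries mapping test names to statuses. The data is the HotSpot table from above, with the tests common to both builds left out; diff_results is a name invented for this illustration.

```python
def diff_results(old, new):
    """Return (old_status, new_status, test) rows for tests whose status changed."""
    rows = []
    for test in sorted(set(old) | set(new)):
        before = old.get(test, '---')   # '---' marks a test absent from that run
        after = new.get(test, '---')
        if before != after:
            rows.append((before, after, test))
    return rows

b13 = {
    'compiler/6547163/Test.java': 'pass',
    'compiler/6563987/Test.java': 'pass',
    'compiler/6571539/Test.java': 'pass',
    'compiler/6595044/Main.java': 'pass',
}
b14 = {
    'compiler/6663621/IVTest.java': 'pass',
    'compiler/6724218/Test.java': 'pass',
}

rows = diff_results(b13, b14)
for before, after, test in rows:
    print('%-6s %-6s %s' % (before, after, test))
print('%d differences' % len(rows))
```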
On December 3, the OpenJDK 6 b14 source bundle was published. Changes of note in this build:
All relevant security fixes in the current round of security updates (6592792, 6484091, 6766136, 6755943, 6751322, 4486841, 6721753, 6734167, 6733336, 6726779, 6588160, 6733959, 6497740).
Upgrading the HotSpot in OpenJDK 6 from HotSpot 10 to HotSpot 11 (6775772).
Fixing build documentation and other miscellaneous fixes (6775844, 6775851, 6683213, 6774170, 6741058, 6728126, 6745052).
This will be the last teamware based source drop; all future source drops will be based on the forthcoming public OpenJDK 6 Mercurial repositories.
Vision

The JDK is big, and hence it ought to be modularized. Doing so would enable significant improvements to the key performance metrics of download size, startup time, and memory footprint.
Java libraries and applications can also benefit from modularization. Truly modular Java components could leverage the performance-improvement techniques applicable to the JDK and also be easy to publish in the form of familiar native packages for many operating systems.
Finally, in order to realize the full potential of a modularized JDK and of modularized applications the Java Platform itself should also be modularized. This would allow applications to be installed with just those components of the JDK that they actually require. It would also enable the JDK to scale down to smaller devices, yet still offer conformance to specific Profiles of the Platform Specification.
Okay—so where do we start?
JDK 7

As a first step toward this brighter, modularized world, Sun’s primary goal in the upcoming JDK 7 release will be to modularize the JDK.
There will be other goals, to be sure—more on those later—but the modularization work will drive the release, which we hope to deliver early in 2010.
Tools

Modularizing the JDK requires a module system capable of supporting such an effort. It requires, in particular, a module system whose core can be implemented directly within the Java virtual machine, since otherwise the central classes of the JDK could not themselves be packaged into meaningful modules.
Modularizing the JDK—or indeed any large code base—is best done with a module system that’s tightly integrated with the Java language, since otherwise the compile-time module environment can differ dramatically from the run-time module environment and thereby make the entire job far more difficult.
Now—which module system should we use?
JSR 277

The current draft of this JSR proposes the JAM module system, which has been the subject of much debate and is far from finished. This system is intended to be at least partly integrated with the Java language. Owing to some of its rich, non-declarative features, however, it would be impossible to implement its core functionality directly within the Java virtual machine.
Sun has therefore decided to halt development of the JAM module system, and to put JSR 277 on hold until after Java SE 7.
JSR 294

This JSR, Improved Modularity Support in the Java Programming Language, is chartered to extend the Java language and the Java virtual machine to support modular programming. Its Expert Group has already discussed language changes that have been well received for their simplicity as well as their utility to existing module systems such as OSGi.
Earlier this year JSR 294 was effectively folded into the JSR 277 effort. Sun intends now to revive 294 as a separate activity, with an expanded Expert Group and greater community involvement, in support of the immediate JDK 7 modularization work as well as the larger goal of modularizing the Java SE Platform itself.
OSGi

If JSR 277’s JAM module system is an unsuitable foundation for modularizing the JDK, what about the OSGi Framework? This module system is reasonably mature, stable, and robust. Its core has even already been implemented within a Java virtual machine, namely that of Apache Harmony. OSGi is not at all integrated with the Java language, however, having been built atop the Java SE Platform rather than from within it.
This last problem can be fixed. Sun plans now to work directly with the OSGi Alliance so that a future version of the OSGi Framework may fully leverage the features of JSR 294 and thereby achieve tighter integration with the language.
Jigsaw

In order to modularize JDK 7 in the next year or so, and in order better to inform the work of JSR 294, Sun will shortly propose to create Project Jigsaw within the OpenJDK Community.
This effort will, of necessity, create a simple, low-level module system whose design will be focused narrowly upon the goal of modularizing the JDK. This module system will be available for developers to use in their own code, and will be fully supported by Sun, but it will not be an official part of the Java SE 7 Platform Specification and might not be supported by other SE 7 implementations.
If and when a future version of the Java SE Platform includes a specific module system then Sun will provide a means to migrate Jigsaw modules up to that standard. In the meantime we’ll actively seek ways in which to interoperate with other module systems, and in particular with OSGi.
All work on Project Jigsaw will be done completely in the open, in as transparent a manner as possible. We hope you’ll join us!
My thanks to Alex Buckley for comments on drafts of this entry.
I think the idea of backtrace filters (the topic of the previous post) is a pretty cool one. And, as I mentioned before, extending gdb with application-specific behavior is a compelling use for the Python scripting capability.
Remembering to source these snippets is a bit of a pain. You could, of course, stick a command into your ~/.gdbinit — that is pretty easy. I like things to be more automatic, though. Suppose someone writes a new filter — it would be nice to get it without having to edit anything.
Naturally, we provide an automatic mechanism for loading code — or I wouldn’t be writing this, would I?
Internally, gdb has a structure called an “objfile“. There is one of these for the inferior’s executable, and another one for each shared library that the inferior has loaded. A new one is also created when gdb loads separate debug info (typical for distros, less so for your own builds).
When gdb creates a new objfile, it takes the objfile’s file name, appends “-gdb.py“, and looks for that file. If it exists, it is evaluated as Python code.
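As a rough model of that naming rule only (gdb's own implementation is not shown here and may differ), the lookup amounts to appending a suffix and checking for the file. Both function names below are made up for this sketch.

```python
import os

def auto_load_script_path(objfile_name):
    """Name of the script looked for alongside an objfile: name + '-gdb.py'."""
    return objfile_name + '-gdb.py'

def find_auto_load_script(objfile_name):
    """Return the script path if such a file exists, otherwise None."""
    candidate = auto_load_script_path(objfile_name)
    return candidate if os.path.isfile(candidate) else None

print(auto_load_script_path('/usr/lib/libstdc++.so.6'))
```

Note that the suffix is appended to the full file name, so a shared library keeps its version suffix in the script name.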
Here’s a simple way to see this in action. Assuming you’ve been using the directory names I’ve used throughout this series, put the following into ~/archer/install/bin/gdb-gdb.py:
import gdb
print "hi from %s" % gdb.get_current_objfile().get_filename()
Now run gdb on itself (remember — you should still have the archer install directory in your PATH):
$ gdb gdb
I get:
[...]
hi from /home/tromey/archer/install/bin/gdb
(gdb)
This naming scheme is ok-ish for stuff you just built, but not so for distros. We’ll be augmenting the search capability a bit so that we can hide the Python files away in a subdirectory of /usr/lib. I’m not sure exactly what we’ll do here, but it shouldn’t be hard to come up with something reasonable.
Another wrinkle is that this scheme does not work transparently for statically-linked executables. Ideally, we would have a way to automatically find these snippets even in this case. One idea that has been mentioned a few times is to put the Python code directly into the executable. Or, we could put the code next to the source. Both of these ideas have some drawbacks, though.
Note that one of these files might be loaded multiple times in a given gdb session — gdb does not track which ones it has loaded. So, I recommend that for “real” projects (something you ship, not just a local hack) you only put import commands (and a couple other idempotent operations, one of which we’ll discuss soon) into the auto-load file, and install the bulk of the Python code somewhere on sys.path.
Our next topic is something that many people have asked for over the years: application-specific pretty-printing. And, as we’ll see, this provides another use for auto-loading of Python code.
If you’re using the bzr-svn plugin to allow you to easily interact with upstream projects using Bazaar, how do you find out the Subversion revision number that you happen to be at? (The bzr revno is, in general, going to be something different.)
Seems to be a common question; I found myself asking it today. Turns out that the bzr-svn plugin adds a line to the bzr log output telling you the answer:
$ cd ~/vcs/gnome-util/trunk
$ bzr pull
$ bzr log | less
------------------------------------------------------------
revno: 6852
svn revno: 8199 (on /trunk)
committer: gforcada
timestamp: Tue 2008-11-25 12:52:19 +0000
message:
Updated Catalan translation
------------------------------------------------------------
revno: 6851
svn revno: 8197 (on /trunk)
committer: gforcada
timestamp: Tue 2008-11-25 12:37:53 +0000
message:
Updated Catalan documentation
------------------------------------------------------------
revno: 6850
svn revno: 8194 (on /trunk)
committer: ebassi
timestamp: Mon 2008-11-24 22:45:28 +0000
message:
2008-11-24 Emmanuele Bassi
* gnome-screenshot.c: (save_options): Save the include-pointer
setting to GConf.
------------------------------------------------------------
...
8199, apparently. :)
Nice feature!
Thanks to Matt Nordhoff for pointing this out to me in #bzr.
AfC
With the call for papers for JavaOne 2009 out, I thought it was high time to belatedly publish the slides for my JavaOne 2008 BOF, Tips and Tricks for Using Language Features in API Design and Implementation.
The session feedback from attendees of my talk was consistent on there being too much material gone over too rapidly. So if I revisit presenting this material in the future, I plan to split the talk in two, one part on kinds of compatibility and another more focused on using language features in API design.
To provide some context for the slides, here are some excerpts of the talk.
Leading up to JavaOne, I had been thinking a lot about compatibility, both in general terms as well as understanding the compatibility properties of previous API work and possible future changes. Besides being a central constraint on API evolution and general evolution of the JDK, compatibility also turned out to be surprisingly complicated. At some point, I'd like to write up further thoughts on the acceptable compatibility region in the three-dimensional space of source, binary, and behavioral compatibility.
I feel there is considerable unrealized potential to have more commonly used program analysis and checking based on annotation processing, now built into Java SE 6 compilers. For example, I think it would be an interesting programming exercise to write an annotation processor to review the source and binary compatibility impacts of an API change. A much simpler example discussed in the BOF is an annotation processor to find methods and constructors that are candidates for conversion to var-args.
A few times in my API work, I've seen that apparently conflicting goals can be met simultaneously by combining different language features, such as JSR 269 annotation processors being able to both use annotations to specify return values while still having a well-typed interface. So for those facing similar challenges, persevere! The solution may be just around the corner.
The last significant section of the talk is a brief defense of Java generics, a topic worthy of future elaboration. While very complicated in the worst cases, many common use cases are straightforward.
Although I find these technical subjects interesting, I expect to be submitting talk proposals on other matters for JavaOne 2009.
You’ll want to update and rebuild your python-gdb before trying this example — I found a bug today.
We’ve learned some of the basics of scripting gdb in Python. Now let’s do something really useful.
Many projects provide a customized backtrace-like gdb command, implemented using “define“. This is very common for interpreters — Python provides a command like this, and so does Emacs. These commands let the user show the state of the interpreted code, not just the state of the interpreter. This is ok if you are conversant with the new gdb commands provided by the program you are debugging, and — more importantly — your program only requires one such special command at a time.
It would be much nicer if the application-specific munging were done by the default backtrace command. And, it would be great if multiple such filters could be used at once.
This, by the way, is a theme of the python-gdb work: gdb is more useful when its standard commands can act in application- or library-specific ways. In the new model, gdb provides the core framework, and the application provides some Python code to make it work better.
I’m sure you know what is coming. We’ve reimplemented backtrace, entirely in Python. And, in so doing, we’ve added some functionality, namely filtering and reverse backtraces (I like gdb’s approach just fine, but, strangely to me, multiple people have wanted the display to come out in the opposite order from the default).
The new backtrace works by creating a frame iterator object — a typical Pythonic idea. The frame iterator simply returns objects that act like gdb.Frame, but that also know how to print themselves. (Hmm… we should probably just add this directly to Frame and avoid a wrapper.) The backtrace command then simply iterates over frames, printing each one. Simple stuff.
You can easily see how to implement a filter here: it is just a frame iterator, that accepts some other frame iterator as an argument. When backtrace requests the next frame, the filter can do as it likes: stop iterating, request a frame from the underlying iterator and modify or drop it, synthesize a new frame, etc. Because filters are required to return frame-like objects, these filters can easily be stacked.
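The stacking idea can be tried outside gdb. The sketch below is a stand-alone Python 3 illustration: FakeFrame, StripPrefixFilter, and DropNamesFilter are hypothetical names, and the frame list is invented, but the filters follow the pattern described above, each being an iterator that wraps another iterator and returns frame-like objects.

```python
class FakeFrame:
    """Hypothetical stand-in for a frame object that knows its function name."""
    def __init__(self, name):
        self.name = name

    def get_name(self):
        return self.name

class StripPrefixFilter:
    """Filter that rewrites frames: strips an 'IA__' prefix from each name."""
    def __init__(self, frames):
        self.frames = iter(frames)

    def __iter__(self):
        return self

    def __next__(self):
        frame = next(self.frames)   # StopIteration from below ends iteration
        name = frame.get_name()
        if name.startswith('IA__'):
            return FakeFrame(name[4:])
        return frame

class DropNamesFilter:
    """Filter that drops frames whose name is in a given set."""
    def __init__(self, frames, names):
        self.frames = iter(frames)
        self.names = set(names)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            frame = next(self.frames)
            if frame.get_name() not in self.names:
                return frame

raw = [FakeFrame(n) for n in ['on_clicked', 'IA__g_signal_emit',
                              'g_signal_emit_valist', 'gtk_button_clicked',
                              'main']]
# Stack the filters: strip prefixes first, then drop signal-emission frames.
stacked = DropNamesFilter(StripPrefixFilter(raw),
                          ['g_signal_emit', 'g_signal_emit_valist'])
names = [frame.get_name() for frame in stacked]
print(names)
```

Because each filter only needs something iterable that yields frame-like objects, the order of stacking is a matter of composition and neither filter knows about the other.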
Here’s a real-life example. This filter is based on Alexander Larsson’s work, and removes glib signal-emission frames. (This could easily be extended to do everything his does, namely emit information about the signal and instance.)
import gdb

class GFrameFilter:
    def __init__ (self, iter):
        self.iter = iter
        self.names = ['signal_emit_unlocked_R',
                      'g_signal_emit_valist',
                      'g_signal_emitv',
                      'g_signal_emit',
                      'g_signal_emit_by_name']

    def __iter__ (self):
        return self

    def wash (self, name):
        if not name:
            return "???"
        if name.startswith("IA__"):
            return name[4:]
        return name

    def next (self):
        while True:
            frame = self.iter.next ()
            name = self.wash (frame.get_name ())
            if name in self.names:
                continue
            if name == 'g_closure_invoke':
                frame = self.iter.next ()
                name = self.wash (frame.get_name ())
                if name.find ('_marshal_') != -1:
                    continue
            break
        return frame

gdb.backtrace.push_frame_filter (GFrameFilter)
This should be pretty clear. The constructor takes a single argument — another frame iterator. The new iterator’s next method simply discards frames that the user is unlikely to care about; it relies on the underlying iterator to terminate the iteration by raising an exception.
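To make the stacking idea concrete outside of gdb, here is a minimal sketch using plain Python objects. Everything in it is an assumption made for illustration: FakeFrame, DropNames and Limit are hypothetical names, not part of python-gdb, and the sketch uses Python 3's __next__ protocol rather than the next method shown above.

```python
class FakeFrame:
    """Hypothetical stand-in for a frame-like object; only get_name() is modeled."""
    def __init__(self, name):
        self.name = name

    def get_name(self):
        return self.name


class DropNames:
    """Filter that wraps another frame iterator and skips frames with listed names."""
    def __init__(self, frames, names):
        self.frames = iter(frames)
        self.names = set(names)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            # StopIteration raised by the underlying iterator ends this one too.
            frame = next(self.frames)
            if frame.get_name() not in self.names:
                return frame


class Limit:
    """Filter that stops the iteration after at most `count` frames."""
    def __init__(self, frames, count):
        self.frames = iter(frames)
        self.remaining = count

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        frame = next(self.frames)
        self.remaining -= 1
        return frame


stack = [FakeFrame(n) for n in ["on_clicked", "g_closure_invoke", "g_signal_emit",
                                "gtk_button_clicked", "gtk_main", "main"]]

# Filters stack because each one accepts and returns frame-like iterators.
filtered = Limit(DropNames(stack, ["g_closure_invoke", "g_signal_emit"]), 3)
names = [frame.get_name() for frame in filtered]
print(names)
```

Because each filter only relies on the iterator protocol and on get_name, the order of stacking is the only thing that ties the two filters together.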
Simply source this into your gdb. Then, load the new backtrace command:
(gdb) require command backtrace
For the time being, the new command is called “new-backtrace”; we’ll change this later once we are sure that the new command is working as desired. If you decide you want an unfiltered backtrace, you can simply invoke “new-backtrace raw”.
You can also easily see how this is useful in modern, mixed-library programs. For instance, the above signal emission filter would compose easily with a filter that nicely displayed Python frames; you could use this if you had a program that mixed Gtk and Python. And, you wouldn’t have to learn any new application- or library-specific commands.
What if you want to configure the filters? Say, drop one, or make one become more verbose? We haven’t built anything like that into new-backtrace, but filters can easily provide their own configurability via the Parameter class we discussed earlier.
Now, wouldn’t it be cool if library-specific backtrace filters were automatically available whenever the inferior loaded the corresponding library? That would make an awesome user experience… and that is what we’ll cover next time.
So. In order to solve the financial crisis, German politicians are now considering giving out coupons to increase consumer demand. How stupid is that? Sounds like a nice way to create an (even more serious) financial crisis to me. Oh wait…
I suppose I should become a politician; I’m sure I can think up even more stupid ideas. For example this: we cut all the politicians’ salaries to, say, what they think should be enough for unemployed persons, that is, around 400€ or less. (This should not even hurt the politicians, since their flights, meals, houses, etc. are paid for anyway…) Then give this money to those who really need it for living and for paying their loans. Also, throw away all their bank-rescue plans. I mean, in good times all the finance people and banks cry for less state control and more capitalism. Now they want more state control and more socialism? WTF? Let the ill-managed banks go down, give the rescue money to their customers (who usually don’t deserve to go down with their banks), and let the healthy banks (yeah, there are a few…) survive. Evolution should do the rest.
Stupid system that discourages intelligent ideas and encourages greediness in politics. No wonder we have a financial crisis now…

Finally, after months of delays and tweaks, FTP 0.2 is out! FTP is the FTP application in GAP for GNUstep and Macintosh.
Externally, few changes can be noticed: a new icon, the correct fixed pitch font in the log window and little more.
Internally, though, several changes happened. Two of them are the main news.
First, the socket core was rewritten not to use file operations: on non-Unix systems sockets may not be files. With some additional effort, this change allows FTP to run on Windows. Thus FTP now runs natively on the Macintosh, on GNUstep on Unix systems like BSD, Solaris or Linux, and now on Windows too.
Second, the data connections are handled in a separate thread, which talks back to the main thread using DO (Distributed Objects). This allows the UI to remain responsive during list and download operations: this wasn't a real issue on the Macintosh, but on GNUstep, windows wouldn't even redraw their contents during download, making the progress bar pretty useless. I still do not allow concurrent downloads, since the FTP protocol is not designed for that and it needs some workarounds.
Enjoy! Be sure to have the latest version of GNUstep base, since the DO system contains some fixes which are needed for FTP to work correctly.
The story so far
The JDK is big—and so we ought to modularize it. Doing so would allow us to improve the key performance metrics of download size, startup time, and memory footprint.
Java libraries and applications can also benefit from modularization. Truly modular Java components could leverage the performance-improvement techniques applicable to the JDK itself, and also be easy to publish as familiar native packages for many operating systems.
Connecting the dots
Given a modularized JDK and a modularized application, the next logical step is to arrange for the application’s modules to declare dependences upon just the JDK modules that it requires.
This would enable a capable native package manager such as rpm or apt to download and install just the JDK modules that are needed to run the application.
This would also enable a next-generation Java Kernel to download, initially, just the application and JDK modules needed to start the application. Once the application has started up then the Kernel would download additional modules on demand, as they’re requested by the program; in the meantime it would trickle-download all the remaining modules in the background.
JDK != Java Platform
The problem with this scenario is that it would leave applications, and any libraries upon which they depend, tightly bound to the JDK and hence not portable in the sense of the “write once, run anywhere” ideal. The JDK is, after all, just one implementation among many of the Java SE Platform.
Omitting some dots
Small devices are becoming faster all the time, with many of them now quite capable of running a HotSpot virtual machine and the core classes of the JDK. Their persistent-storage capacities, however, are growing at a slower rate.
Manufacturers of embedded devices and high-end cell phones generally aren’t willing to increase the cost of a product in order to pay for the memory required to carry around libraries that are never actually used by applications. With a modular JDK they’d have the option of carrying just the modules that will be used by expected applications, and with modular applications they’d be able to verify, at either build or install time, the availability of the necessary JDK modules.
The monolithic Java Platform
The further problem with this second scenario is that no proper subset of a modular JDK is an implementation of the Java SE Platform, since the Platform itself is monolithic and indivisible. Put another way, no such subset can be tested for conformance against any particular Java SE Platform Specification, since the Specification itself is monolithic and indivisible.
Meta-modularization
By now the solution to the above two problems is, no doubt, obvious: The Java SE Platform should be modularized.
This process could start with the definition of an abstract “Java SE” module whose version number would be that of the Java SE Platform Specification that defines it. The content of that specification would, as usual, be determined by the Expert Group for the corresponding Java SE Platform JSR under the auspices of the Java Community Process.
The primary module of the JDK—which would presumably be an aggregation of the concrete modules comprising the JDK—could then be amended to declare that it implements the Java SE module, much as a Java class can declare that it implements an interface. Similar adjustments could be made to other Platform implementations.
Once this is done then a modularized application could declare that it depends upon a specific version of the SE Platform—but not upon any particular implementation thereof. At runtime it would be up to the underlying module system to locate a Platform implementation capable of fulfilling this abstract requirement.
Divide and conquer
To finish addressing the two problems described above, the next version of the Java SE Platform Specification could define specific common subsets—or Profiles—of the Platform. There could be, e.g., Profiles for minimal headless applications, for simple RIA-style desktop applications, for richer desktop applications, for server applications, and for other configurations at various points in between. The Java Compatibility Kit would correspondingly be divided into subsets of tests for each Profile.
Each Profile would be represented by its own abstract module. For each of the Profiles that it supports, a conforming implementation of the Platform would be required to include a concrete module that implements the corresponding abstract module.
Once all this is done then a modularized application could declare that it depends upon a specific versioned Profile of the Platform. An application delivered as familiar native packages, or via a Java-Kernel-like mechanism, would no longer be bound to any particular Platform implementation. A small device, finally, could include just the subsets of the Platform that are required by the applications expected to run upon it—yet still claim conformance to the Platform Specification.
Now wouldn’t that be truly and completely cool?
The only remaining question is: How do we get there?
So, Cambridge supports the following configuration no worse than Mac OS supports Mac hardware: Intel Core 2 Duo T7800 with GMA X3100, Dell Wireless 1490 802.11a/g, 1680 x 1050 resolution.
It has been a few days, and we’ve pushed a few changes. So, you should update your gdb and rebuild it before continuing with the examples.
In addition to ordinary commands, gdb has “set/show” commands, which are basically a way to manipulate various global variables that control aspects of gdb’s behavior. The Python API to gdb calls these “parameters”, and provides a way to create them — this is similar to gdb.Command, which we’ve already learned about, but a bit simpler.
Here’s how our “backtrace” reimplementation (which we’ll cover in detail later) defines a new parameter for controlling how the trace is printed:
import gdb

class ReverseBacktraceParameter (gdb.Parameter):
    """The new backtrace command can show backtraces in 'reverse' order.
    This means that the innermost frame will be printed last.
    Note that reverse backtraces are more expensive to compute."""

    set_doc = "Enable or disable reverse backtraces."
    show_doc = "Show whether backtraces will be printed in reverse order."

    def __init__(self):
        super (ReverseBacktraceParameter, self).__init__ ("reverse-backtrace",
                                                          gdb.COMMAND_STACK,
                                                          gdb.PARAM_BOOLEAN)
        # Default to compatibility with gdb.
        self.value = False

# Instantiate the class to register the parameter with gdb.
ReverseBacktraceParameter()
Parameters have a name, a command category, and a type. The available types are described in the manual. The current setting is available via the “value” attribute.
I think parameters are a nice feature when you are polishing your gdb extensions for more general use. Otherwise… well, they’re there.
In an earlier post, we saw the “require” command for the first time. This is a new command, written in Python, that dynamically loads new Python commands and functions. I wrote this command to let people experiment a bit — and I didn’t want to automatically load all the extension commands. I’m not completely sure this will stick around (your opinion matters here…), but in the meantime, you can add your own commands to it quite easily.
The “require” command works by simply importing a Python module of the correct name. And, its completion feature works by searching a certain directory in the gdb install tree for “.py” files in the expected package. So, adding your own commands and functions is as simple as putting them into the correct directory.
For example, we can make the above command visible like so:
$ cp /tmp/example ~/archer/install/share/gdb/python/gdb/command/reverse-param.py
Now try the require command — completion should see your new file. If you run the require command, “show reverse-backtrace” should start working.
Convenience functions can be required the same way — they live in the “gdb.function” package.
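The mechanism described above (import a module of the right name, and complete by listing “.py” files in a directory) can be sketched without gdb. This is only an illustration under stated assumptions: the helper names available and require, and the demo_ext package, are hypothetical and are not the actual implementation of the “require” command.

```python
import importlib
import os
import sys
import tempfile


def available(package_dir):
    """Return the names completion could offer: the .py files in the directory."""
    return sorted(name[:-3] for name in os.listdir(package_dir)
                  if name.endswith(".py") and name != "__init__.py")


def require(package, name):
    """Load an extension by importing the module <package>.<name>."""
    return importlib.import_module(package + "." + name)


# Build a throwaway package tree so the sketch runs anywhere.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_ext", "command")
os.makedirs(pkg_dir)
for directory in (os.path.join(root, "demo_ext"), pkg_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "hello.py"), "w") as f:
    f.write("LOADED = True\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

choices = available(pkg_dir)
module = require("demo_ext.command", "hello")
print(choices, module.LOADED)
```

In this model, dropping a new file into the directory is enough to make it both completable and loadable, which is the property the text relies on.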
Having Python extensions which are themselves extensible — like require — is an emerging theme of python-gdb. This sort of hackery is much more natural in a dynamic scripting language. Next time we’ll dive into another case of this: the filtering backtrace implementation.
Space is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to space.
— Douglas Adams, The Hitchhiker’s Guide to the Galaxy
The JDK is big, too—though not (yet) as big as space.
It’s big because over the last thirteen years the Java SE platform has grown from a small system originally intended for embedded devices into a rich collection of libraries serving a wide variety of needs across a broad range of environments.
It’s incredibly handy to have such a large and capable Swiss-Army knife at one’s disposal, but size is not without its costs.
Size
The JDK and its runtime subset, the JRE, have always been delivered as massive, indivisible artifacts. The growth of the platform has thus inevitably led to the growth of the basic JRE download, which now stands at well over 14MB despite heroic engineering efforts such as the Pack200 class-file compression format.
Complexity
The JDK is big—and it’s also deeply interconnected. It has been built, on the whole, as a monolithic software system. In this mode of development it’s completely natural to take advantage of other parts of the platform when writing new code or even just improving old code, relying upon the flexible linking mechanism of the Java virtual machine to make it all work at runtime.
Over the years, however, this style of development can lead to unexpected connections between APIs—and between their implementations—leading in turn to increased startup time and memory footprint. A trivial command-line “Hello, world!” program, e.g., now loads and initializes over 300 separate classes, taking around 100ms on a recent desktop machine despite yet more heroic engineering efforts such as class-data sharing. The situation is even worse, of course, for larger applications.
Palliatives
The Java Kernel and Quickstarter features in the JDK 6u10 release do improve download time and (cold) startup time, at least for Windows users. These techniques really just address the symptoms of long-term interconnected growth, however, rather than the underlying cause.
The modular JDK
The most promising way to improve the key metrics of download time, startup time, and memory footprint is to attack the root problem head-on: Divide the JDK into a set of well-specified and separate, yet interdependent, modules.
The process of restructuring the JDK into modules would force all of the unexpected interconnections out into the open where they can be analyzed and, in many cases, either hidden or eliminated. This would, in turn, reduce the total number of classes loaded and thereby improve both startup time and memory footprint.
If we had a modular JDK then at download time we could deliver just those modules required to start a particular application, rather than the entire JRE. The Java Kernel is a first step toward this kind of solution; a further advantage of having well-specified modules is that the download stream could be customized, in advance, to the particular needs of the application at hand.
Now, wouldn’t all that be cool?
Going further, the modularization process could be applied not just to the JDK but to libraries and applications themselves so as to improve these metrics even more. Doing so might also enable us to address some other longstanding problems related to the packaging and delivery of Java code.
In the beginning
Java applets were originally just collections of class files and other resources, such as sound and image files, published on a web server and downloaded over HTTP—one at a time—by early web browsers.
This approach was, of course, not terribly efficient in terms of bandwidth, and so the user experience over a slow connection was pretty unpleasant. There was, moreover, no way to cryptographically sign an applet in order to guarantee its integrity and authenticate its publisher.
Thus in 1996 was born the JAR file format, a simple extension of the popular ZIP archive format with manifest and signature files. In the absence of a better alternative the JAR format rapidly became the standard way to package reusable Java libraries and, with the advent of the Main-Class manifest header, even entire applications. Experience has shown, however, that the JAR format is not very well-suited to these more-ambitious uses.
Failures of expedience
A JAR file is little more than a set of classes packaged into a single unit which can be transported over the wire, cryptographically verified, and, ultimately, placed on a traditional class loader’s class path. Code packaged in JAR files is hence susceptible to all the usual problems of the class-path model, in particular those arising from the fact that all classes on the path are in the same flat namespace with their visibility determined only by the order, on the path, of their containing JAR files.
There are many ways in which this model can fail. The most common failure mode occurs when two versions of a library are inadvertently placed on the class path. In this situation chaos is likely to ensue—and be very difficult to debug. Application code might appear to use the older version, the newer version, or some bizarre combination of the two, all depending on the exact differences between the versions and their placement on the class path.
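A toy model may help show how the “bizarre combination” arises. The sketch below is not how a real class loader is implemented; it only assumes, as the text says, that visibility is decided by the order of the JAR files on the path. The JAR names, class names and the load_class helper are all made up for illustration.

```python
def load_class(class_path, class_name):
    """Return (jar, body) from the first JAR on the path that holds class_name."""
    for jar_name, classes in class_path:
        if class_name in classes:
            return jar_name, classes[class_name]
    raise LookupError(class_name)


# Two versions of the same hypothetical library, both on the class path.
lib_v1 = ("lib-1.0.jar", {"lib.Parser": "Parser v1", "lib.Util": "Util v1"})
lib_v2 = ("lib-2.0.jar", {"lib.Parser": "Parser v2", "lib.Lexer": "Lexer v2"})
class_path = [lib_v1, lib_v2]

parser = load_class(class_path, "lib.Parser")  # found first in the older JAR
lexer = load_class(class_path, "lib.Lexer")    # only present in the newer JAR
print(parser, lexer)
```

The application ends up with a version 1 Parser working alongside a version 2 Lexer, and swapping the order of the two entries silently changes which mixture it gets.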
This failure mode, and others, have been encountered by so many unwary developers over the years that this general set of problems has been dubbed “JAR hell.”
Modular code
If we’re going to modularize the JDK then why not do the same for Java libraries and applications? If we have the facilities required to divide the JDK into a set of well-specified and separate, yet interdependent, modules then we should be able to leverage those same tools even further in order to climb out of JAR hell.
This can work because the metadata in a module is much richer than that in a simple JAR file. A module’s metadata describes its own version as well as the dependences it has upon other modules. The dependences can themselves be constrained with respect to version numbers so that, e.g., a module X can declare that it needs version 1.2.3 of module Y, or any later version.
At run time the classes in a module are not simply added to a class path. The loading and linking of the classes within a module is, rather, guided by the dependence and versioning constraints in the module’s metadata. During this process care is taken to ensure that classes in one module are visible to classes in another only when intended, and that no module is ever linked to more than one version of another.
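As a rough illustration of the two rules just described (a dependence may demand “this version or later”, and only one version of a module is ever linked), here is a small sketch. The data layout, the resolve helper and the policy of choosing the highest acceptable version are assumptions made for this example; the text does not specify how a real module system would choose.

```python
def parse(version):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(available, requirements):
    """Pick exactly one version of each required module.

    available:    {module name: [version strings]}
    requirements: [(module name, minimum version string)]
    """
    # Combine all constraints on a module into a single minimum version.
    floors = {}
    for module, minimum in requirements:
        floors[module] = max(floors.get(module, (0,)), parse(minimum))

    chosen = {}
    for module, floor in floors.items():
        candidates = [v for v in available.get(module, []) if parse(v) >= floor]
        if not candidates:
            raise LookupError("no suitable version of " + module)
        # Assumed policy: take the highest version that satisfies everyone.
        chosen[module] = max(candidates, key=parse)
    return chosen


available = {"Y": ["1.2.0", "1.2.3", "1.10.0"]}
requirements = [("Y", "1.2.3"), ("Y", "1.3")]
chosen = resolve(available, requirements)
print(chosen)
```

Both requesters are linked against the same single version of Y, which is the guarantee that the flat class path cannot give.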
Packaging Java libraries and applications as modules would allow us to ascend from JAR hell into the clear light of day, where versioning and dependence information is declared at compile time and then leveraged during distribution, installation, and run time. The techniques for improving download time, startup time, and memory footprint applicable to a modular JDK would, moreover, be equally applicable to modularized Java libraries and applications. Truly modular Java components would, finally, also enable us to address at least one other longstanding problem area …
Native packaging
One of the age-old critiques of the Java platform is that it doesn’t integrate very well with the native operating systems upon which it runs. An oft-cited aspect of this critique is that the usual means of packaging Java code—i.e., JAR files—has no well-defined relationship to native packaging systems.
Many an application developer has coped with this impedance-mismatch problem by creating, for each target platform, a native package containing the JAR files for the application, the JAR files for all required libraries—and an entire JRE.
This is an effective solution, but a crude one which introduces problems of its own. Delivering monolithic, self-contained native packages wastes both download time and disk space. More importantly, the lack of sharing of common components—whether of libraries or of the JRE itself—makes it impossible to update those components independently in order to fix security bugs and other critical issues.
A different approach to the impedance-mismatch problem is to address it directly, by building suitable native packages for Java libraries and applications that were originally delivered as JAR files. The enterprising developers behind the JPackage Project have done exactly this, for RPM-based Linux platforms, for a wide variety of popular Java components.
A large part of the time spent in this sort of effort, whether for RPM or for any other reasonably-capable native packaging system—e.g., Debian, SVR4, or IPS—rests in identifying and encoding inter-component dependences. Simple JAR files do not contain such metadata, so in many cases quite a bit of tedious detective work is required.
If Java libraries and applications were packaged as modules, complete with accurate version and dependence metadata, then it would be almost trivial to transform them into sensible native packages. One could even imagine a single tool, delivered as part of the JDK, that would implement this transformation for common native platforms, at least for simple cases.
Now—that would be cool.
More ikvmc testing revealed a couple of IKVM.Reflection.Emit bugs and also some bugs that have been lurking in ikvmc for a long time, that only show up under fairly uncommon scenarios when lots of dependencies are missing (and hence the resulting assembly has little chance of working anyway). The fixes are small and relatively low risk, but I'm not sure whether it is worthwhile to back port them to 0.36 and 0.38. Feedback on this is appreciated.
Changes:
As always with a development snapshot, don't use this in production, but please do try it out and let me know about it. The sources are available in cvs and the binaries here: ikvmbin-0.39.3257.zip
In the previous installment we found out how to write new gdb commands in Python. Now we’ll see how to write new functions, so your Python code can be called during expression evaluation. This will make gdb’s command language even more powerful.
A longstanding oddity of gdb is that its command language is unreflective. That is, there is a lot of information that gdb has that is extremely difficult to manipulate programmatically. You can often accomplish seemingly impossible tasks if you are willing to do something very ugly: for instance, you can use “set logging” to write things to a file, then use “shell” to run sed or something over the file to produce a second file in gdb syntax, then “source” the result. This is a pain, not to mention slow. The Python interface removes this hackery and replaces it with what you really want anyway: an ordinary programming language with a rich set of libraries.
If you use gdb regularly, then you already know about convenience variables. These are special variables whose names start with a dollar sign, that can be used to name values. They are part of gdb’s state, but you can freely use them in expressions:
(gdb) set var $conv = 23
(gdb) call function ($conv)
On the python branch, we introduce convenience functions. These use a similar syntax (in fact the same syntax: they are actually implemented as convenience variables whose type is “internal function”), and when invoked simply run a bit of Python code. As you might expect after reading the previous post, a new function is implemented by subclassing gdb.Function. Our example this time is the one that got me hacking on python-gdb in the first place: checking the name of the caller.
import gdb

class CallerIs (gdb.Function):
    """Return True if the calling function's name is equal to a string.
    This function takes one or two arguments.
    The first argument is the name of a function; if the calling function's
    name is equal to this argument, this function returns True.
    The optional second argument tells this function how many stack frames
    to traverse to find the calling function. The default is 1."""

    def __init__ (self):
        super (CallerIs, self).__init__ ("caller_is")

    def invoke (self, name, nframes = 1):
        frame = gdb.get_current_frame ()
        while nframes > 0:
            frame = frame.get_prev ()
            nframes = nframes - 1
        return frame.get_name () == name.string ()

CallerIs()
Let’s walk through this.
The Python docstring is used as the help string in gdb. Convenience function documentation is shown by typing “help function” in the CLI.
Function’s initializer takes one argument: the name of the function. We chose “caller_is”, so the user will see this as “$caller_is”.
When the function is invoked, gdb calls the invoke method with the arguments that the user supplied. These will all be instances of the gdb.Value class. A Value is how a value from the debuggee (called the “inferior” in gdb terminology) is represented. Value provides all the useful methods you might expect, many in a nicely pythonic form. For example, a Value representing an integer can be added to Python integers, or converted to one using the built-in int function. (See the manual for a full rundown of Value.)
gdb just tries to pass all the arguments the user supplied down to the invoke method. That is, gdb doesn’t care about the function arity — so you can do fun things, like implement varargs functions, or, as you see above, use Python’s default value feature.
This invoke method takes a function name and an optional number of frames. Then it sees if the Nth frame’s function name matches the argument — if so, it returns true, otherwise it returns false. You can use this to good effect in a breakpoint condition, to break only if your function was called from a certain place. E.g.:
(gdb) break function if $caller_is("main")
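The frame-walking logic can be tried outside of gdb with a mocked call chain. In this sketch FakeFrame and the standalone caller_is function are hypothetical stand-ins, not the python-gdb API. Unlike the invoke method above, it also guards against walking past the outermost frame, which is an addition of this example.

```python
class FakeFrame:
    """Hypothetical stand-in for a stack frame: a name plus a link to its caller."""
    def __init__(self, name, prev=None):
        self.name = name
        self.prev = prev


def caller_is(frame, name, nframes=1):
    """Walk nframes callers up from frame and compare the function name."""
    while nframes > 0 and frame is not None:
        frame = frame.prev
        nframes -= 1
    # Running off the end of the stack counts as "no match".
    return frame is not None and frame.name == name


# main called helper, which called function (the innermost frame).
main = FakeFrame("main")
helper = FakeFrame("helper", prev=main)
current = FakeFrame("function", prev=helper)

print(caller_is(current, "helper"))    # direct caller, default nframes
print(caller_is(current, "main", 2))   # two frames up
```

The default value for nframes plays the same role as in the invoke method: the one-argument call checks the immediate caller.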
(Hmm. Perhaps we should have just let the user register any callable object, instead of requiring a particular subclass.)
Value and its as-yet-unintroduced friend, Type, are actually a bit like Python’s ctypes — however, instead of working in the current process, they work on the process being debugged. I actually looked a little at reusing the ctypes API here, but it was not a good fit.
Next we’ll revisit commands a little; specifically, we’ll see how to write a new “set” command, and we’ll explore the “require” command. After that we’ll go back to writing new stuff.
My main interest in this release is seeing the LiveConnect work that I started in IcedTea finished and released. Deepak Bhole did an amazing job taking over where I left off — IcedTeaPlugin was prototype quality when I handed it over. He completed the major features and then polished the result into the robust plugin that appears in Fedora 10. Thank you Deepak!
Today, I had my first “Just Works” experience with the plugin when reading about Biosphere 2. The Cortado video player applet in that page first asked me if it could open a connection to upload.wikimedia.org and then played the video. Adjusting the PulseAudio Mixer volume control works for this applet, which may mean that IcedTeaPlugin is also using the PulseAudio backend written by my excellent interns, Ioana Ivan and Omair Majid. It’s nice to see that work released too!
In general Fedora 10 is nice. All the hardware on my HP Pavilion dv5t works fine, though the 3D graphics performance is not good. The graphical boot feature doesn’t seem to be enabled for my graphics card yet which is a little disappointing but booting does feel faster with this release.
I’m currently using swfdec 0.9.2 (built from source) as my Flash plugin. It’s getting there! The only thing preventing YouTube from working acceptably is the lack of support for the AAC audio codec. It’s nice to see progress on this very important component of the Free Software desktop!
I’ve also got Ekiga set up but I’ll have to convince some of my Skype-using friends to try it with me.
Looking forward to Fedora 11, I’m glad to see the volume control nightmare coming to an end and I’m super-excited about the Windows cross-compilation efforts. My wish list at this point is pretty short: kernel modesetting enabled for my graphics hardware, better 3D graphics performance, and to the Red Hat Java Team: try to get IcedTeaPlugin on the Fedora LiveCD!
If you’re like me, you’ve probably wished you could write new commands in gdb. Sure, there is the define command — which I’ve used heavily — but it has some annoying limitations. For example, you can’t make new sub-commands using define. Also, arguments to define are parsed oddly; you can’t have an argument that has a space in it. Instead, the user has to do the quoting, unlike other gdb commands.
Naturally, the Python extensions to gdb make these problems trivial to solve. In this post, I’ll show you how to implement a new command to save breakpoints to a file. This is a feature I’ve often wanted, and which, strangely, has never been written.
I think we’ll call our new command “save breakpoints”. Given the existence of “save-tracepoints” you might think that “save-breakpoints” is a better choice; but I never liked the former, and plus this gives me a chance to show you how to make a new prefix command. Let’s start there.
A command written in Python is implemented by subclassing gdb.Command:
import gdb

class SavePrefixCommand (gdb.Command):
    "Prefix command for saving things."

    def __init__ (self):
        super (SavePrefixCommand, self).__init__ ("save",
                                                  gdb.COMMAND_SUPPORT,
                                                  gdb.COMPLETE_NONE, True)

SavePrefixCommand()
That is simple enough. Command’s init method takes the name of the command, a command class (the choices are listed in the manual), the completion method (optional; and again, documented), and an optional final argument which is True for prefix commands.
You can try the above very easily. Save the above to a file and then source it into gdb, telling gdb that it is Python code:
(gdb) source -p /tmp/example
Now take a look:
(gdb) help save
Prefix command for saving things.
List of save subcommands:
Type "help save" followed by save subcommand name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
Note how the class documentation string turned into the gdb help.
Now we’ll write the subcommand which does the actual work. It will take a filename as an argument, and will write a series of gdb commands to the file. This will make it simple to restore breakpoints — just source the file.
As you might expect, many of gdb’s internal data structures have Python analogs. We’ve seen commands, above. Our new command will examine Breakpoint objects. The gdb manual has documentation for most of these new classes (but not all, yet. In the early days of this work we were quite lax … we’re still catching up).
from __future__ import with_statement
import gdb

class SaveBreakpointsCommand (gdb.Command):
    """Save the current breakpoints to a file.
    This command takes a single argument, a file name.
    The breakpoints can be restored using the 'source' command."""

    def __init__ (self):
        super (SaveBreakpointsCommand, self).__init__ ("save breakpoints",
                                                       gdb.COMMAND_SUPPORT,
                                                       gdb.COMPLETE_FILENAME)

    def invoke (self, arg, from_tty):
        with open (arg, 'w') as f:
            for bp in gdb.get_breakpoints ():
                print >> f, "break", bp.get_location (),
                if bp.get_thread () is not None:
                    print >> f, " thread", bp.get_thread (),
                if bp.get_condition () is not None:
                    print >> f, " if", bp.get_condition (),
                print >> f
                if not bp.is_enabled ():
                    print >> f, "disable $bpnum"
                # Note: we don't save the ignore count; there doesn't
                # seem to be much point.
                commands = bp.get_commands ()
                if commands is not None:
                    print >> f, "commands"
                    # Note that COMMANDS has a trailing newline.
                    print >> f, commands,
                    print >> f, "end"
                print >> f

SaveBreakpointsCommand ()
I think the init method is fairly straightforward, given our earlier class definition. The name of the command is “save breakpoints” — the Command initializer will parse this properly for us. The command takes a filename as an argument, and because there is a built-in filename completer, we simply ask gdb to use that for us. This means that completion on the command-line will automatically do the right thing.
The invoke method is where we do the real work; gdb calls it when the command is executed from the CLI. There are two arguments passed by gdb. The first is the argument to the command. This is either a string, if there is an argument, or None. The second argument is a boolean, and is True if the user typed this command at the CLI; it is False if the command was run as part of a script of some kind, for instance a .gdbinit or breakpoint commands.
The body of invoke is quite simple: we loop over all existing breakpoints, and write a representation of each one to the output file. This is simple, because the gdb command language is easy to emit, and because the Breakpoint class provides simple access to all the attributes we care about.
Note that the precise syntax of some things here is already slated to change. We’re going to drop the “get_” prefixes at least; and some class attributes will move from method form to “field” form. We’ll be making these changes soon, before there is too much code out there relying on the current API.
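To make the shape of the emitted script concrete, here is a minimal sketch of the per-breakpoint formatting pulled out into a plain function, so it can be tried outside gdb. It assumes the "field" form of the API, with attributes named location, thread, condition, enabled and commands; those names are my assumption for illustration, so check the manual for your gdb before relying on them. The SimpleNamespace object is only a stand-in for a real Breakpoint.

```python
from types import SimpleNamespace


def breakpoint_script(bp):
    """Return the gdb commands that would recreate one breakpoint.

    `bp` is any object with location, thread, condition, enabled and
    commands attributes (assumed names, see the note above).
    """
    line = "break " + bp.location
    if bp.thread is not None:
        line += " thread %d" % bp.thread
    if bp.condition is not None:
        line += " if " + bp.condition
    lines = [line]
    if not bp.enabled:
        # $bpnum refers to the most recently created breakpoint.
        lines.append("disable $bpnum")
    if bp.commands is not None:
        lines.append("commands")
        # The command text ends with a newline; drop it before joining.
        lines.append(bp.commands.rstrip("\n"))
        lines.append("end")
    return "\n".join(lines) + "\n"


# Stand-in for a real breakpoint object, for illustration only.
fake = SimpleNamespace(location="main.c:42", thread=None,
                       condition="x > 3", enabled=False,
                       commands="silent\nprint x\n")
print(breakpoint_script(fake))
```

Inside gdb, the invoke method would call such a helper once per breakpoint and write the result to the file. Keeping the formatting separate from the gdb calls also makes it easy to test.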
Adding new commands is easy! A few Python-based commands come with your gdb. You can use the new “require” command to load new commands dynamically. To see what is available, try:
(gdb) require command <TAB>
We’re also currently shipping an “alias” command, a simple version of “pahole” (find holes in structures), a replacement for “backtrace” which we’ll explore in a later post, and maybe more by the time you read this (I’m considering committing the “save breakpoints” command).
A new command isn’t always the best way to extend the debugger. For example, it is fairly common for an application to have a user-defined command to pretty-print a data structure; but wouldn’t it be nicer for the user if this was automatically integrated into the print command?
It would also sometimes be more convenient to hook into expression evaluation — that would make breakpoint conditions more powerful, among other things. This is what we’ll explore in the next post.
This is the first in what I hope will be a series on using the python-enabled gdb.
We’ll start at the very beginning: checking it out, building it, and then “hello, world”.
First, install the prerequisites — aside from the normal development stuff, you will need the python development packages. You will also need git. If you don’t have makeinfo installed, you will want that, too. On Fedora you can get these with:
$ sudo yum install python-devel git texinfo
That was easy! Now we will check out and build the gdb branch. I like to prepare for this by making a new directory where I will keep the source, build, and install trees:
$ mkdir -p ~/archer/build ~/archer/install
$ cd ~/archer
“Archer” is the name of our project, in case you’re wondering. Note that the clone step will take a while — it is downloading the entire history of gdb. My source tree weighs in at about 50M.
$ git clone git://sourceware.org/git/archer.git
$ cd archer
$ git checkout --track -b python origin/archer-tromey-python
Now you have archer checked out and on the right branch. If you’re curious, other work is being done in the Archer repository. Currently each separate project has its own “topic branch”, but we’ll be merging them together for a release. You can see the different topics with:
$ git branch -r | grep archer
This repository is also periodically synced from gdb. For example you could look at the in-progress multi-process work — that is, patches to let gdb debug multiple processes at the same time — by checking out the branch gdb/multiprocess-20081120-branch.
Now build gdb. Depending on your machine, this might take a while.
$ cd ../build
$ ../archer/configure --prefix=$(cd ../install && pwd)
$ make all install
(As an aside: I have typed that goofy --prefix line a million times. I wish configure would just do that for me.)
Now you are ready to try your python-enabled gdb. We’ll just add the right directory to your path and then do a smoke test for now; we’ll look at more functionality in the next installment.
$ PATH=~/archer/install/bin:$PATH
$ gdb
[...]
(gdb) python print 23
23
(gdb) quit
It worked!
Once you have this set up, future updates are even simpler — and faster. Just pull and rebuild:
$ cd ~/archer/archer
$ git pull
$ cd ../build
$ make all install
We’re actively hacking on this branch, so you may like to do this regularly. If you find bugs, feel free to email the Archer list. (We’ll have bugzilla working “soon”, but for the time being just fire off a note.)
I think next time we will look at writing a new gdb command in Python. Also, we’ll try a couple of new commands, written this way, that are shipped with the Python-enabled gdb.
In the future, it will be much simpler to get this gdb. Like I said before, I want it to be in Fedora 11 (and yeah, making the feature page is on my to-do list; I’ll try to get to it next week). Also, I think Daniel is making a Debian experimental package for it.
Update: fixed a bug in the checkout command. Oops.
I am going to brag for just a second…
After I updated my system to F-10, I rebooted and sound worked without any further tinkering. WAIT… that’s not all. I plugged in my cellphone, and Fedora immediately recognized that it had a camera on it, and I was able to transfer to/from my phone without any further tinkering. AND I plugged in my iPod, and I was immediately able to transfer to/from it without any further tinkering.
I am sure there are several more improvements from Fedora 9, but at the moment, these are most important to me. Oh yeah, and Facebook’s Photo Uploader now works thanks to Tom and Deepak’s LiveConnect work in OpenJDK.

I have a number of clients, colleagues, and friends who live in Mumbai, a place that is sadly no stranger to trauma and which is going through it again. In such times we can only hope that our friends and their families have not been hurt in the unpleasantness that is presently besetting their world.
It is hard for outsiders to understand the confusion that embroils such situations; people reporting the news always seem to convey the air of knowing what is going on. Of course they don’t, but the people that can spot the discrepancy usually aren’t exactly watching television.
Being present during a terrorist attack is, sadly, an experience that far too many of us share. Many people find it difficult to remain calm in the face of crisis and chaos. Not for us, far away, to say that they should feel any differently.
What I can offer, however, is the absolute certainty that the one way above all others that we beat those who would tear down our society is by not falling prey to fear. Incidents are bad enough, but the real damage is afterward, when we allow the circumstances to themselves become a cause and a self-inflicted excuse for curtailing our freedoms. No. Returning to normal is not easy, but it’s how we win.
AfC
Yesterday evening I started implementing a Java2D pipeline on top of DirectFB. Today I can already run the full Java2Demo. Yay. I’ll spare myself the obligatory screenshots; they get quite boring, as they always look more or less the same. But I must say that I quite like DirectFB. If it were up to me, X11 could be ditched and free desktops built on top of DirectFB. It has a very nice and well-thought-out API with a surprising amount of stuff in it (OpenGL, video overlay, alpha blending, etc., if supported by the provider, of course).


Last night we had a Fedora 10 release party at Seneca@York. Thanks to Chris Tyler who organized the room and the pizza. A bunch of us went from the Red Hat office here in Toronto. We were unfortunately a bit late but once everything got started we discussed and demonstrated many of the new F10 features like Better Webcam Support, Eclipse 3.4, LiveConnect, VisualVM, and more. We also watched a video featuring the ever-amazing Dan Williams showing off NetworkManager connection sharing and troubleshooted (troubleshot?) a few installation issues people had.

The Freedom Toaster was continuously churning out DVDs.

I think it was a great event. Seneca’s got some awesome stuff going on and it was fun to be a part of it for an evening. Thanks to Fedora for sponsoring the food and pop!
Without much fanfare, systemtap 0.8 was released a little while ago. There is one little tidbit in the release notes that does warrant some excitement, though:
User space probing is supported at a prototype level, for kernels built with the utrace patches.
So what does that mean? Take for example the para-callgraph.stp script:
$ stap para-callgraph.stp 'process("/bin/ls").function("*")' -c /bin/ls
0 ls(12631):->main argc=0x1 argv=0x7fff1ec3b038
276 ls(12631): ->human_options spec=0x0 opts=0x61a28c block_size=0x61a290
365 ls(12631): <-human_options return=0x0
496 ls(12631): ->clone_quoting_options o=0x0
657 ls(12631): ->xmemdup p=0x61a600 s=0x28
815 ls(12631): ->xmalloc n=0x28
908 ls(12631): <-xmalloc return=0x1efe540
950 ls(12631): <-xmemdup return=0x1efe540
990 ls(12631): <-clone_quoting_options return=0x1efe540
1030 ls(12631): ->get_quoting_style o=0x1efe540
[...]
650290 ls(12631): <-print_current_files
650330 ls(12631): <-print_dir
650456 ls(12631): ->free_pending_ent p=0x1f02d90
650539 ls(12631): <-free_pending_ent
650660 ls(12631): ->close_stdout
650821 ls(12631): ->close_stream stream=0x376db6c780
650966 ls(12631): <-close_stream return=0x0
651082 ls(12631): ->close_stream stream=0x376db6c860
651164 ls(12631): <-close_stream return=0x0
651205 ls(12631): <-close_stdout
Another nice thing added in Fedora 10 is oprofile-jit, which enhances the system profiler with Java support (for runtimes supporting jvmti/jvmpi; gcj native code was obviously already supported). Just add -agentlib:jvmti_oprofile to your java invocation, and then opreport can give you stuff like:
samples % linenr info image name app name symbol name
136220 20.3345 (no location information) 21010.jo java Interpreter
15176 2.2654 indexSet.cpp:528 libjvm.so libjvm.so IndexSetIterator::advance_and_next()
12273 1.8321 (no location information) 21010.jo java int[] java.math.BigInteger.montReduce(int[], int[], int, int)
11129 1.6613 (no location information) 21010.jo java int java.text.CollationElementIterator.next()
9932 1.4826 (no location information) 21010.jo java java.lang.String com.sun.javatest.finder.JavaCommentStream.readComment()~1
9731 1.4526 (no location information) 21010.jo java java.nio.charset.CoderResult sun.nio.cs.UTF_8$Decoder.decodeArrayLoop(java.nio.ByteBuffer, java.nio.CharBuffer)
9239 1.3792 reg_split.cpp:409 libjvm.so libjvm.so PhaseChaitin::Split(unsigned int)
8617 1.2863 ifg.cpp:464 libjvm.so libjvm.so PhaseChaitin::build_ifg_physical(ResourceArea*)
A long time ago I wrote a tiny perl script that told you the time in various places. It was a somewhat unusual take on the usual approach to the timezone problem in that it displays offsets from where you are, not offsets from UTC (which, unless you’re in the UK in the winter time, are really kinda useless). A number of people liked using it, which was nice. It was called “slashtime” since /time is a shortcut on my company’s website to get to an HTML version of it.
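The "offsets from where you are" idea is simple arithmetic: subtract your own UTC offset from the other place's. Here is a minimal Python sketch of that idea; it is not how slashtime does it. The two fixed-offset zones are illustrative stand-ins, and real code would use proper zone data so that daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone


def relative_offset(home, other, when):
    """Return how far `other` is ahead of `home` at the instant `when`."""
    return (when.astimezone(other).utcoffset()
            - when.astimezone(home).utcoffset())


# Fixed offsets chosen for illustration only (no daylight-saving rules).
sydney = timezone(timedelta(hours=11), "Sydney (summer)")
toronto = timezone(timedelta(hours=-5), "Toronto (winter)")

when = datetime(2008, 12, 25, 12, 0, tzinfo=timezone.utc)
hours = relative_offset(sydney, toronto, when) / timedelta(hours=1)
print("Toronto is %+g hours from Sydney" % hours)
```

The function accepts any tzinfo object, so the same call should work with full zone definitions in place of the fixed offsets.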

For a while, though, I’d wanted to make a version that would be graphical; in addition to being more compact, I wanted it to be live and to help me with arranging meetings. So I did!
Here’s a screenshot of Slashtime running.
The new Slashtime inherited the original’s premise of showing offsets, of course, and adds some other niceties. When the sun is up is irrelevant in this day and age, but business hours aren’t (white background), and neither is knowing when it’s not a good time to call someone (the dark shading).
Knowing where you are is important too; that’s the blue line. There are a number of heuristics to try and figure that out, but if your Linux box’s /etc/localtime is a symlink to a file in /usr/share/zoneinfo like it’s supposed to be, you’re golden; it degrades gracefully from there, looking at what /etc/timezone says, then the TZ environment variable, etc., and doing all this in a hopefully OS-aware way (there’s code in there that made it work on Solaris, for example).
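The fallback chain described above can be sketched in a few lines. This is a rough Python illustration of the order of the checks only, not the program's actual code (which is Java and handles more cases). The function name guess_zone and its root parameter are made up for the example.

```python
import os


def guess_zone(root="/", environ=None):
    """Return a zone name such as "Europe/Paris", or None if unknown."""
    if environ is None:
        environ = os.environ

    # 1. /etc/localtime as a symlink into a zoneinfo directory.
    localtime = os.path.join(root, "etc", "localtime")
    if os.path.islink(localtime):
        target = os.readlink(localtime)
        marker = "zoneinfo/"
        if marker in target:
            return target.split(marker, 1)[1]

    # 2. The first line of /etc/timezone, if the file exists.
    try:
        with open(os.path.join(root, "etc", "timezone")) as f:
            name = f.readline().strip()
        if name:
            return name
    except OSError:
        pass

    # 3. The TZ environment variable (a leading ':' is permitted).
    tz = environ.get("TZ", "").lstrip(":")
    return tz or None


print(guess_zone())
```

The root parameter exists only so the function can be pointed at a scratch directory when experimenting.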
Oh, and yes, 01:30 is the fold point, quite deliberately. Hackers don’t go to bed at midnight. Perish the thought. So if the person you’re looking for is in the dark portion but at the bottom of the display, there’s every chance they’re still up :).
The list of places shown is specified in a simple text file at ~/.tzlist (a default list will come up if you don’t have one). Instructions on how to set this file up to your own preferences are shipped with the program in the PLACES file. As you can see, I have quite a number of places in my .tzlist file, but there’s nothing wrong with just having two or three if those are the only places you want to know about.
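If you want to read such a list from your own scripts, something like the following Python sketch would do. It assumes each entry is one line of three double-quoted fields (zone name, display name, country). That is a simplification for illustration; the PLACES file is the authority on the real format.

```python
import shlex


def parse_tzlist(text):
    """Parse place entries into (zone, name, country) tuples.

    Assumes one entry per line, three quoted fields each; blank
    lines are skipped.
    """
    places = []
    for number, raw in enumerate(text.splitlines(), start=1):
        fields = shlex.split(raw)
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError("line %d: expected 3 fields, got %d"
                             % (number, len(fields)))
        places.append(tuple(fields))
    return places


sample = '"Australia/Sydney" "Sydney" "Australia"\n'
print(parse_tzlist(sample))
```

Using shlex means names containing spaces work as long as they are quoted.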
The discussion in the PLACES example makes a point that might not be obvious at first glance: you control the names of the places shown. So if you live in Marseilles, and are tired of every other gizmo out there showing the time in “Paris”, you just go right ahead and put “Marseille” in your .tzlist file as:
"Europe/Paris" "Marseille" "France"
There’s also a meeting planner. Right-click the list and select “Meeting…” from the context menu:
and you can set the program to display a specific time and date somewhere in the world. More typically, you need to hunt for a good time to have a phone meeting with someone; just move the sliders back and forth until you find a nice alignment for you and the other people on the call.

This example shows me working out that, assuming I’m in Sydney that week, if I want to call my Mum to wish her a Happy Christmas, then so long as I call just before I go to bed on 25 December it won’t be too early there (the red border is a warning that you’re not seeing the current time displayed).
The GUI version of Slashtime has actually been around a long while; it’s written in Java and served as an early test bed for the java-gnome bindings of GTK and GNOME. Thanks to recent work by Serkan Kaba, however, the program is now properly internationalized. Not that there’s much to translate, but it’s important to at least set the foundation. Serkan did Turkish; I’ve done French Canadian (ahem, that’ll just go to show how rusty my Quebecois is). I must admit that I’m still pretty new to internationalization and localization, so I’m sure there’s room for improvement here.
Slashtime 0.5.9 was released this week with that branch merged. It’s packaged on Gentoo Linux as app-misc/slashtime. Building it yourself shouldn’t be hard; thanks to people like Carl Worth and Rob Taylor it works out of the box on a number of other distros. You’ll need java-gnome >= 4.0.9. Just follow the instructions in the