opensource.google.com

Menu

Posts from October 2009

Google Summer of Code Mentor Summit 2009

Friday, October 30, 2009


This past weekend, approximately 250 Open Source developers from around the world gathered at Google's headquarters in Mountain View, CA for the fourth Google Summer of Code™ Mentor Summit. These developers who mentored students in this year's Google Summer of Code program gathered "unconference" style to discuss ways to improve the program, share their experiences, and learn about each other's projects.

One of the recurring comments about what makes the Mentor Summit special was that it gathers developers from a diverse range of projects (all 150 organizations participating in this year's Google Summer of Code were invited to send two delegates). This allowed for a cross pollination of ideas that isn't usually found at conferences dedicated to one specific platform or language. In addition, the summit was an opportunity for developers who usually collaborate online to meet face to face. In fact, some of our attendees met colleagues they had been working with for several years in person for the first time at the summit!

Most of all, the summit was a great place to meet like minded Open Source developers who are passionate about bringing in new contributors to their communities. Check out photos from the event or read through the session notes to find out more about what happened at this year's summit.

Fall at the OSPO

Friday, October 23, 2009

The leaves are turning here in Mountain View, but they are not the only ones blazing away. It's a busy time of year for open source for Google, with lots of talks and events going on.

Recently:
- Ben Collins-Sussman and Brian (Fitz) Fitzpatrick gave their "Myth of the Genius Programmer" talk as part of the Opening sessions at "Reflections / Projections", the 15th ACM@UIUC Student Computing Conference at the University of Illinois Urbana-Champaign.

- They were joined by Googler and Python maintainer Alex Martelli, who spoke on "Python and the Programmer".

- Chris DiBona, head of the Open Source Programs Office at Google gave a keynote at AstriCon in Glendale, Arizona.

Currently:
- Earlier this week Leslie Hawthorn, manager of the Google Summer of Code program, was part of the amazing team that completed a new "Manual on GSoC Mentoring" in 2, count them, 2 DAYS, finishing up late last night. You will hear more about this feat in a later post after the...

- Google Summer of Code Mentor Summit 2009, being held in Mountain View this weekend, October 24th and 25th. This invitation-only gathering of mentors from each of the participating mentoring organizations in this year's GSoC gives the projects a chance to come together to compare notes on the mentoring process and cross-pollinate their projects. A good time promises to be had by all, and a full report will be forthcoming.

Coming up:
- Jonathan Blocksom will be speaking on Google App Engine and the All For Good project at the DC edition of Stack Overflow Dev Days, October 26th.

- On October 4th the LISA Conference in Baltimore, Maryland will feature a talk by Daniel Berlin and Joe Gregorio on the Google Wave Federation Protocol, the underlying open network protocol for sharing waves between wave providers. Interested attendees of LISA will be able to sign up for a developers Wave Sandbox Account. They will also have a chance to win Googley prizes at the Google Birds of a Feather session the next evening, hosted by Cat Allman and Tom Limoncelli.

TalkBack: An Open Source Screenreader For Android

Tuesday, October 20, 2009

Earlier this year, we blogged about project Eyes-Free — a collection of Android applications that enable efficient eyes-free interaction with your mobile phone. Since then, one of the questions we have received most often is about a complete access API to enable general purpose adaptive technologies such as screenreaders.

We are happy to announce the first version of such an API as part of the latest Android release (Donut). This new API is now available within the Android 1.6 SDK , and we welcome developer feedback. The Android Access framework generates android.view.accessibility.AccessibilityEvent
in response to user interaction events; the event payload contains additional details about the event, e.g., the user interface control that received focus. This access framework enables the creation of general purpose screenreading applications that make all of Android's user interface, as well as native Android applications built with standard Android widgets usable without looking at the screen.

You can see this API in use within our Open Source Android screenreader TalkBack. With TalkBack installed, standard Android user interface elements such as ListView produce spoken feedback during user interaction. Applications SoundBack (for producing non-spoken auditory feedback) and KickBack (for producing haptic feedback) generate additional augmentative output and demonstrate how multiple access applications can be active simultaneously.

in response to user interaction events; the event payload contains additional details about the event, e.g., the user interface control that received focus. This access framework enables the creation of general purpose screenreading applications that make all of Android's user interface, as well as native Android applications built with standard Android widgets usable without looking at the screen.

What This Means For Developers

If you are interested in developing innovative access solutions on Android and have been eagerly waiting for our access APIs, the Donut SDK contains what you have been waiting for — including a set of free voices for English (US and UK), French, Italian, German and Spanish. You can use TalkBack, SoundBack and KickBack as a starting point for designing your own access innovations.

If you are an Android developer interested in making your applications more widely usable, you can use TalkBack and friends to quickly verify whether your applications remain usable when not looking at the screen. In this context, here are a few coding tips to ensure that your applications work out of the box with these tools:
  1. Ensure that all visually drawn UI controls have meaningful textual labels.
  2. Ensure that users can navigate to all controls in your application using the trackball.
  3. Ensure that navigating controls in your application with the trackball results in a meaningful traversal order.
What This Means For End Users

End-users of Android 1.6 (Donut) can enable TalkBack, SoundBack and KickBack via the Accessibility section of the Settings menu. You need to do this only once i.e., once enabled, these access applications remain active across restarts. Note that depending on your Android device, you may need to install these applications from the Android Market; we will post videos that demonstrate step-by-step instructions for specific Android devices in the Eyes-Free channel on YouTube.

Providing Feedback

We (T. V. Raman, Charles L Chen, and Svetoslav Ganov) will be continuously improving the underlying APIs and access tools, and we look forward to your questions and feedback on the Android Developers Group.

Boldly Talking Python in Boulder

Friday, October 16, 2009

On Saturday, October 10, the Front Range Pythoneers had a Python "unconference" at the Google facilities in Boulder, Colorado, USA. An "Unconference" is a conference organized around the principles ofopen space technologies, which tries to provide many of the benefits of traditional conferences without the associated ceremony. We still got to enjoy some delicious pizza, though.


Introducing the Pycon Boulder Attendees to Principles of Open Space

Photo Credit: Matt Boersma

It was unseasonably snowy and cold Saturday morning, but in spite of the weather, almost everybody that signed up in advance was there, along with a few last-minute registrants. We had nearly 40 attendees join us for 15+ sessions, plus the always loved "hallway track." Many thanks to the three Googlers who came out to shepherd our group and facilitate the meeting.

You can find more information about the event and sessions on our wiki, Tweets about the event and this great post-conference write up. You can also check out some more photos of the participants and our scheduling process. We discussed the following topics, among others:
The best surprise of the event? Bruce Eckel, the author of Thinking in Java, was among the participants. Thanks again to Google for hosting the unconference; it worked really well for our purposes. The Google Boulder facility is gorgeous.

Fighting Bad Memories: The Stressful Application Test

Thursday, October 15, 2009

We've just released Stressful Application Test (or stressapptest), a hardware test used here at Google to test a large number of components in a machine. The test tries to maximize random traffic to memory from processor and disks with the intent of creating a realistic high load situation. The source code is available under the Apache license.

stressapptest may be used for various purposes:
  • Stress test for machines.

  • Hardware qualification and debugging.

  • Memory interface test.

  • Disk testing.



The stressapptest team (from left to right): Matthew Blecker, John Huang, Raphael Menderico, Nick Sanders, John Hawley and James Vera

Photo credit: Taral Joglekar


stressapptest is a user space test, primarily composed of threads doing memory copies and direct I/O disk read/write. Since many hardware issues reproduce infrequently, or only under corner cases, the idea behind the test is that by maximizing bus and memory traffic, the number of transactions is increased, and therefore the probability of failing a transaction is increased. It loads the memory with specially-designed patterns that cause the signal lines to rapidly switch between 1 and 0, drawing the maximum amount of power and cause maximal noise on the nearby voltage rails. Noise on voltage rails and coupling with other nearby lines is likely to cause signaling problems on marginal lines. Also, given a probability of any signal level transition failing, these patterns have the most memory transitions per period of time, and are thus more likely to exhibit a failure.

This test was designed to test all memory available on a machine, which is not guaranteed with the execution of a CPU-intensive application (for instance, compiling the kernel on multiple threads). Moreover, it is focused on testing the memory interface and connections, not the memory internally, like memtest86. As a consequence, Stressful Application Test will detect errors not detected by regular memory tests or extended executions. A comparison with some other memory reliability tests showed that about 20% of the DIMM-related failures detected on the machines tested were only detected by Stressful Application Test, and it was capable of reporting 70% of all DIMM errors detected by all tests.

We hope this software will be useful to system administrators who need to diagnose and repair DIMM or other components. We look forward to your questions and feedback in our discussion group. Happy hacking and may your testing be less stressful!

Testing Race Conditions in Java

Monday, October 12, 2009

Can you spot the bug in the following piece of Java code?


/** Maintains a list of names. */
public class NameManager {
  private List<String> names = new ArrayList<String>();
  /** Stores a new list of names. This method is threadsafe. */
  public void setNames(List<String> newNames) {
    synchronized (names) {
      names = new ArrayList<String>();
      for (String name : newNames) {
        names.add(name);
      }
    }
  }

(Hint: the method setNames() is synchronized on the names field, but that field is then modified to point to a new object.)

OK, so spotting the bug was easy. But how would you write a Unit Test to demonstrate the problem? You would need to have two or more threads calling setNames() simultaneously, but you still don't have any control over how the threads will be scheduled.

Enter Thread Weaver, a test framework that lets you control thread execution. By setting breakpoints in your code, you can stop one thread at exactly the point that you want, and then allow a second thread to run. This allows you to write repeatable multi-threaded unit tests, without relying on the thread scheduler.

Thread Weaver is released as an open source project under the Apache license, and is available on Google Code. Many examples can be found in the initial documentation. If you have comments or questions, please see our discussion group. Happy testing!

Ed. Note: Post updated with corrected formatting.

By Alasdair Mackintosh, Software Engineering Team

MoinMoin's Google Summer of Code Wrap Up

Friday, October 9, 2009

We at the MoinMoin Wiki software development team had a wonderful time with our participation in Google Summer of Code™ 2009. We greatly enjoyed collaborating with our students, hacking Python and Javascript code for the wiki engine. Thanks to Google's support, we had four student projects total, and three of them were successfully completed:

Christopher Denter, whom I mentored, worked on making MoinMoin's modular storage code production-ready by adding an access control middleware. Christopher's work in this area made MoinMoin safer and more flexible. He also worked on a router middleware - think of it as a kind of a wiki
"mount/fstab" - and a SQLAlchemy backend. Our users can now enjoy MoinMoin with MySQL, PostgreSQL, SQLite, etc. Christopher's work was done directly in the repo that will become the 2.0 release of MoinMoin.

Alexandre Martani, mentored by Bastian Blank, worked on a realtime collaborative wiki editor based on Google's mobwrite. Multiple people can now choose to edit the same wiki page at the same time and they all see each other's changes shortly after typing. We hope that we can merge his code into the MoinMoin 2.0 repository soon.

Dmitrijs Milajevs, mentored by Reimar Bauer, worked on groups and dictionary code with modular backends. You can now fetch group definitions from wiki pages or a wiki, and preparations have been made to make an LDAP group backend possible as part of future development. Dmitrijs also refactored the search code to get rid of the unmaintained xapwrap library and use the new xappy library. All his work has already merged into the MoinMoin 1.9 main repo.

Thanks also to Alexander Schremmer for his contributions as a mentor. Unfortunately, his student's project did not work out, but in true community fashion he provided valuable help and feedback for the other students.

In case you're curious about when all this nice code will be released:

MoinMoin 1.9 will be released later in 2009 (likely in November). Please help us beta testing, translating and generally making the release ready.

MoinMoin 2.0 will not just 1.9 + 0.1, but a major rewrite of big parts of the code base. Right now, it's like a big construction site, so it'll naturally take some time until the release will be ready, likely 2010 or 2011. We'd be happy to have your help with it; if you enjoy coding in Python, playing with new features, cleanly refactoring code and working with a fun team, then do join us to make MoinMoin an even better wiki. Check out the MoinMoin 2.0 page for more details.

Many thanks to all the students and mentors as well as everyone in the community who helped or supported the process. It was a very productive summer and we are greatly looking forward to continued work with our new contributors!

SIP Communicator's Summer of Code Adventures: Part Two

Tuesday, October 6, 2009

Ed. Note: You may recall that last week we published the first installment of Emil Ivov's report on SIP Communicator's participation in Google Summer of Code™. This week, Emil shares more of the project's 2009 success stories and lessons learned by the project over the past three instances of the program.

Geek Communicator

Linus Wallgren from Sweden completed a task that many of us have been dreaming about for a long time now: handling SIP Communicator entirely through the command line. So what exactly does this mean? Well, now, you can exit the application, hide or show it, send or receive messages, make or answer phone calls and open or close chats, entirely through the command line. So, you remember that super script that you always wanted to do? The one that sends a message to all your online friends at 3 o'clock every morning? You can now do it thanks to Linus! His work is going to be integrated into SIP Communicator some time this year so stay tuned!

Geek Communicator: Using SIP Communicator through the Console

Setting Your Own Avatar

Shashank Tyagi from India was accepted for the "Dude, checkout my photo!" project. His work consisted of making sure that it was possible for SIP Communicator users to upload a new photo/avatar with popular protocols like XMPP, MSN, Yahoo! Messenger, ICQ and AIM. He first started by exploring the mechanisms supported by the various protocol stacks that allowed this, discovering a few glitches on the way. He then worked on the glue that allows the SIP Communicator protocol modules to export this functionality to the rest of the application, and the GUI. Finally, with some help from his mentor, he also managed to wrap up a module that allowed users to take a picture of themselves using their webcam right before uploading it. Cool, isn't it?

Shashank's work is definitely going to get integrated into our trunk as soon as possible. However, until then you can either test it through his SVN branch or at least sneak a peek here:

Setting Your Own Avatar via SIP Communicator

DTMF with RTP

Romain Philibert
from France worked with us on the project "DTMF with RTP" which had the goal of providing an alternative transport for DTMF tones in audio RTP streams in addition to the existing SIP INFO method. The first phase of the development consisted of research on the possible approaches to solving the problem and the viability of each of the approaches was explored with proof-of-concept implementations. The second phase was the actual implementation of the chosen solution and involved refactoring existing source code to generalize it enough to also serve the goal of the project, employing the rearchitected design for the sake of sending and receiving DTMF tones as part of audio RTP streams, writing new UI to allow switching between the alternative DTMF transports, and creating unit tests to assure the correct operation of the functionality. Romain was exposed to communicating on our development mailing list where he reported his progress throughout the program, gathered feedback from members of our community and helped another contributor in resolving a common problem related to the unit tests. The source code he produced has been reviewed and currently awaits for a major redesign of the media service of our project to be finished in order to be updated and integrated into trunk.

DTMF with RTP

Impressive list, right? We're quite happy with it :)

So, let's now get to the final part and look through the three most important lessons we've learned throughout the past three years.

Lesson 1: You've Got to Have the Time

Google Summer of Code is a two-way process. Really! You take a lot so you have to be prepared to give as much. This is not a subcontracting deal where you could simply expect work to get done by itself because it is being paid for (not that this ever happens in subcontracting, anyway). Having a dedicated mentor for a student's project is almost as important as having a dedicated student. I've seen very few exceptions to this and it actually comes down to the following:
  1. We are dealing with students who are still learning. As eager as they are to get things done, most of them have little development experience. Therefore if left to themselves, students would tend to over-engineer, go for a dirty hack, overlook existing documentation, misunderstand the goal of the project or a bunch of other things that seem so natural to experienced project developers.

  2. Three months is hardly enough time even for experienced developers to fully grasp the internals of a mature project that they've never seen before. It is therefore naive to expect that a student would be able to come up with a usable and integratable contribution without a fair amount of guidance.
I am far from saying that you should be spoon feeding your student or do their work for them. To make things a bit more specific, I'd say that according to our experience a mentor should be ready to spend an average of about 45-60 minutes per day working with their student. Time is rarely equally spread across the summer. Our mentors would often find themselves spending up to two or three hours a day in the beginning of the program while 15 minute chats would be enough to resolve issues toward the end of the term.

Lesson 2: Less is More

I know, I know ... what a cliché! Still, it took us some time to actually realize it so I think it's important to note this lesson. I already mentioned that in 2008 we took 15 students and that this was not our best year. Mentoring resources were of course part of the issue. We had 4 of our most active developers take up two students each. First, this proved to be quite hard for the mentors themselves. Dedicating two hours a day to mentoring may turn out to be an issue when this is not part of your day job. Second, it was also a problem for the other students and their mentors. Given that our most active mentors had their hands full with their own students, they had little time to spare giving advice to other mentor-student pairs when they needed it. This turned out to be a blocking factor on more than one occasion and there was no one happy with the situation.

In addition to mentoring resources, a higher number of students are also hard to handle by the community itself. This means that people would be less aware of the progress of every project, there would be hence less interest, less encouragement, less acknowledgment and community integration for the students.

At the end of the day, we did manage to handle things and we only had a single failed project in 2008. However, the experience was far from the pleasing memories we had from 2007. It was therefore a good lesson to learn because taking less students was one of the main reasons for a successful 2009.

Lesson 3: One Committer per Student

I believe the one-mentor-for-one-student ratio is now commonly accepted practice for Google Summer of Code and that most projects are striving for it. We definitely have done our best to avoid mentor sharing since 2008. Having more than one (non-shared) mentor per student is even better but unfortunately not always possible. Another ratio that is just as important, and probably not that popular, is the number of committers involved as mentors. Code integration represents a significant part of the effort that projects spend over Google Summer of Code. It is quite obvious that if the developer committing the work of a particular student is also their mentor, integration is going to be a lot easier than if it were someone else.

For example, we had some very valuable stuff written during 2008, like support for proxies from Atul Aggarwal. Atul did a good job, but, his only mentor, despite being very technically savvy and knowing the project quite well, did not have committer access. Proxy support is quite important for SIP Communicator, although not necessarily critical. Committing Atul's work however, would require an existing committer to study all his work, and there always seems to be something in the critical path for development that must be reviewed. Things would have been a lot easier if one of the people that were expert in this field had been following the project right from the start.

We therefore decided to add pair each student with a committer in 2009, and each committer only had to take care of one student. The results were excellent, and as I already mentioned, we already have approx 30% of the GSoC code committed barely a week after the end of the program!

Lesson 4: Specific Tasks and Clear Conditions (Learning in Progress ...)

Ok ... this case is not really that straightforward and we have more learning to do before we really get it. Here's the problem:

In 2007 and 2008, we had a couple of students who would get to 50% or 60% of their work and then get distracted with unimportant stuff or simply disappear for a while. At a point their mentors would remind them that they have more to do and this would cause the students to feel uneasy, panic, or start arguing about things, such as:

"Oh, I didn't know I had to do unit testing!" or "I was never told this feature was part of the job!"

The statements weren't completely false. It could indeed happen that a task would seem obvious to a mentor and in the same time feel utterly unnatural to a student. In one case, it was actually the mentor who didn't request a task that was considered important by other community members.

So either way, in order to try and limit the surprises we decided that we needed to start every project with a list of clearly defined sub-tasks. This way, we thought, students would know exactly what they need to do and organize better. It would also help make sure that everyone on our side was well aware of the "official" project vision. Sounds neat, right?

Well, it didn't really work out that way.. Most of the students didn't have a problem with the new system, but then again, most of the students didn't have problems without it either. One of the students we failed, however, claimed the requirements list had been misleading and had made them believe they could plan a few weeks off. When we told them that this would be risky they complained it was too late to cancel the reservations, so they didn't listen ... and eventually failed.

So it appears that a list of what we believe to be specific requirements doesn't seem to change much in terms of understanding the goals of a particular project since there's always something that could be misunderstood. Clearly, continued mentor-student communication is crucial here but it seems that we'd also need explicit there-may-be-more-to-this-than-you-think notes.

Phew, that sums it all up! Hope the lessons we've learned above would help others in similar situations. Good luck to all of you future Google Summer of Coders!

Ed. note: Post title corrected.

Perls of Wisdom: The Perl Foundation & Parrot's Google Summer of Code

Monday, October 5, 2009

Google Summer of Code™ 2009 (GSoC) was filled with fresh faces and exciting new projects for The Perl Foundation (TPF). As I write this, we are currently in the final stage of the summer where students submit evidence of their work in a zip/tar.gz file by uploading it to a publicly viewable repository. I very much like that now anyone on the 'net can download a file containing the entire summer of work by the student, and there is even a download count next to each file for each student!

We started the summer with nine students, but Kevin Tew was not able to work on "A prototype LLVM JIT runcore for Parrot" in the program due to external issues. He is still a core Parrot Virtual Machine developer and I hope that he can find time to work on this awesome project some time in the near future. Since Parrot is currently redesigning its JIT framework from the ground up, a project similar to this would be great for next year.

The other Parrot VM project was Daniel Arbelo Arrocha working on "Decimal Arithmetic: BigInt, BigNum and BigRat for parrot" with Christoph Otto as a mentor. Big Decimal Arithmetic basically means storing arbitrary large/arbitrary precision numbers internally in a decimal format rather than the binary format usually used. Doing this can prevent catastrophic rounding errors. If you can store numbers internally with exactly NO error, then obviously this is A Very Good Thing. Daniel worked on making dynamic PMC's (Polymorphic Parrot Classes, or Parrot Magic cookies, take your pick) which wrap the mature and extensive IBM's libdecnumber library. What this means is that Parrot can do arithmetic on arbitrarily large integers (BigInt's) or floating point numbers with arbitrary precision (BigNum's.) Financial people are very interested in these as well, since no one wants to be short-changed on their interest due to rounding error. Daniel has also been contributing patches to many other parts of Parrot and will probably be getting a commit bit soon, which is great news to hear.

Devin Austin worked on "Refactoring Catalyst helper modules", with Kieren Diment as a mentor. This involved some "Moosification", which means refactoring home-rolled object-oriented code to use Moose (the post-modern Perl 5 object system), i.e. less code to maintain and more features at your fingertips.

I am excited to talk about our next student, who worked on a Perl 6-related project. Carl Mäsak mentored Hinrik Örn Sigurðsson on "Perl 6 End-User Documentation Tools" (github repo) . Hinrik is working on the grok command, which is the Perl 6 relative of perldoc. With it, you can get documentation for Perl 6 functions from the spec, read the Synopses and Apocalypses, and occasionally attain temporary enlightenment. If you have a properly installed CPAN client on your computer, you can install it with cpan App::Grok.

Our other exciting Perl 6 project was Paweł Murias working on "Multimethods for SMOP", mentored by Daniel Ruoso, the lead developer of SMOP. SMOP stands for Simple Meta Object Programming (or Simple Matter of Programming, if you are feeling snarky) and it is an implementation of Perl 6, a sister of Rakudo. Multimethods, short for Multiple Method Dispatch, is a feature where a language can determine which variant of a set of functions to call, based on the type of their arguments. One way that this becomes very powerful is that you can use wildcard arguments when you declare your multimethod, so you can essentially write many functions at once. Less code to maintain is a big WIN ! Paweł's code is being directly merged into the mainline SMOP codebase and from what I hear, he is and will continue to be a core contributor. That is what GSoC is all about. That and free t-shirts.



State Transitions in Mojo::Pipeline

Pascal Gaudette worked on "HTTP/1.1 Compliance Testing and User-Agent Development for the Mojo Web Framework" with mentor Viacheslav Tykhanovskyi. This is important because Mojo, one of the newest and most exciting Perl Web frameworks did not have much testing for acting correctly according to HTTP/1.1 . Part of the work of this summer has become the CPAN module MojoX::UserAgent. He has a great blog post about "State Transitions in Mojo" wherein he explains how Mojo deals with state and generated some pretty cool transition diagrams by documenting when a state transition happens in the test suite and then feeding this data into Graph::Easy.

WebKit has become often-forked and very influential open source browser engine, so it is no surprise that Perl hackers want bindings to it. Ryan Jendoubi worked on "Cross-platform Perl Bindings for wxWebKit" with mentor Michael Peters, which allows WebKit and WxWigdets, a cross platform GUI library, to talk to each other via wxWebKit. I think this was one of the more difficult projects this year, not because of the programming/algorithms involved, but because it requires getting lots of cross-platform, constantly-changing and fickle pieces of software to get along with each other. This is often like inviting zebras and lions to the same party. Messy and dangerous. But Ryan prevailed and we give him much respect and hope that he continues to maintain and improve the Wx::WebKit bindings.

I had the pleasure of mentoring Robert (Bob) Kuo this summer on his work entitled "Implement BPSW algorithm as a Perl 5 CPAN module, Math::Primality with extensive test-suite." The Math::Primality module is already on CPAN and I am glad to announce that Bob is listed as co-maintainer and published the latest release. This module is important because the Perl 5 Cryptography CPAN modules (mostly in the Crypt::* namespace) have one, very large, very fragile dependency, called Math::Pari. Math::Pari is an amazing library that gives Perl access to the extensive state-of-the-art number theory library PARI, but to attain ultimate speed, Math::Pari pokes into undocumented internal-only Perl core internals, which means that changes to Perl internals that *shouldn't* have any effect on the outside world can cause the entire Crypt::* namespace to break. Also, most of the Crypt::* namespace require only 5-10 functions from Math::Pari, which provides an interface to thousands of functions. Math::Primality implements the few prime-checking (primality) functions that Crypt::* modules want from Math::Pari in a small, easy-to-maintain, pure-Perl CPAN module. Bob implemented the BPSW algorithm, a state of the art prime-number checking algorithm which allows you to check if an arbitrarily large number is prime in O( log(n) ) i.e. logarithmic running time. It is actually a combination of very different prime number checks, which weed out different types of non-prime numbers. So far, no one has found a counter-example to the BPSW algorithm (even though those pesky mathematicians say there probably are), so it is the best out there currently. It is estimated that because no one has seen this algorithm fail yet, and it being used extensively from within other algorithms, that the first counter-example must be at least 10,000 decimal digits long! Future steps for this module will be to work on its sister module, Math::Factoring, which implements the remaining factoring-related functions that Crypto modules want and then use both modules as new dependencies for the Crypt:: namespace, instead of Math::Pari.

Justin Hunter was mentored by Ash Berlin on the project "SQL::Translator Rewrite" (github repo). SQL::Translator is a very popular CPAN module for translating various "flavors" of SQL to and from each other, such as Postgres to MySQL. This involved more "Moosification", as described above. Justin also has some advice for hopeful GSoC students:
  • Get over yourself.

  • Understand there are people smarter than you or people better at some things than you.

  • Just the same, you're still needed.

  • Just Go Ahead and Do It and find your niche.

I wish someone had told me that about 10 years ago. I would not have wasted a lot of time worrying that people would think I was stupid and starting diving into Open Source projects much more earnestly.

In total, we had a success rate of 8/9 = 88.8%, just a bit above this year's all time high of 85%. Please join me in congratulating all these students and mentors for their top-notch work ! I can honestly say that I had much fun interacting with so many corners of the Perl community. This was my first year being an organization administrator, I learned a lot by my favorite method: sink or swim. I was handed over the magic rocket-powered surf board by Eric Wilhelm after successfully mentoring Thierry Moisan last year on the Math::GSL CPAN module. Organizing and communicating with people spread across a dozen time zones is definitely an art that I am still mastering. I think using as many mediums as possible to communication with people is key. I already use chat, email and IRC, but I wish I had done voice/skype and/or video chat with some of my students and mentors, so that everyone has a face to attach to a name.

I would also like to thank Jerry Gay for being The Perl Foundation co-pilot this year and welcoming me into the Parrot community. Your guidance in certain matters went a long way.

For anyone that would like to help/be involved in Google Summer of Code with The Perl Foundation next year, we cordially invite you to our IRC channel, #soc-help on irc.perl.org, and the mailing list.

SVG at Google and in Internet Explorer

Friday, October 2, 2009

At Google we are doing some exciting work with SVG, including hosting the SVG Open conference, helping SVG to work on Internet Explorer, and working with Wikipedia. Make sure to check out the Google Code Blog for all the details!

Report from EuroBSDCon 2009

I'm no stranger to EuroBSDCon. After attending several very successful conferences in the US, three FreeBSD contributors and I decided that Europe needed a BSD conference too. In November 2001 we were proud to host 160 or so delegates in the first European BSD Conference. Over the last couple of years I haven't been able to keep as up to date with the latest developments in the BSD world, so I was very interested to attend EuroBSDCon 2009, organised in collaboration with the UK Unix User Group.

With the conference split in to several tracks it was impossible to attend every talk, so I decided to focus primarily on those that talked about how BSD systems were helping people solve problems in the real world. Links to all the papers, slides, and in some cases audio from the presentations can be found at conference schedule page.

The first talk I attended was "How FreeBSD Finds Oil," given by Harrison Grundy. Harrison runs a consultancy company in the US providing clustered computing systems to oil and gas companies.

From EuroBSDCon 2009

He started with a run through of the economics of oil and gas exploration. It quickly became clear that the "Free" in "FreeBSD" is of no concern to these companies, as software licensing costs are such a tiny part of their overall expense. Features like stability and performance are far more important -- his customers frequently run lengthy computational jobs over terabytes of data. This is somewhat similar to what we do at Google, although obviously the data is very different. I asked whether the industry was moving in the direction of technologies like Hadoop (an open source implementation of MapReduce) but for the moment, at least, the answer seems to be no. "It's not broken, so why fix it?" appearing to be the prevailing view.

Next was Konrad Heuer, talking about "FreeBSD in a complex environment." Here he described some lessons learned from running a heterogeneous environment of systems - FreeBSD, Linux, Windows, Solaris, and issues that they faced and benefits they saw with FreeBSD. Chief amongst those benefits seemed to be the commitment by the project to continue to support APIs and higher level interfaces. Their print services have run on FreeBSD for more than 10 years, with very few modifications required. The biggest issue seemed to be commercial support; he described a number of hacks required to be able to use Tivoli Storage Manager (which they use on their other systems) to also back up their FreeBSD systems. In the discussion that followed there was a suggestion to create a mechanism where people could register things like this, so that vendors realise that many of their Linux sales are actually BSD sales, and have more incentive to create a native version of the application.

Peter Losher from the Internet Systems Consortium presented next. The ISC is a non-profit organisation that has, for years, developed, or funded the development of much of the core software the "runs" the Internet, including the DNS server BIND, and DHCP server software. ISC also provides hosting, connectivity, and mirroring services for several open source projects, including many of the BSDs. Peter talked in some detail about the mechanisms used to make the F root DNS server highly available, and features in FreeBSD that make this possible. He also talked a little about IPv6, and new features in DHCP v4.x to support IPv61.

After the lunch break Kirk McKusick talked about Superpage support in FreeBSD. He was quick to point out that he'd had no hand in the work himself, and was describing work carried out by Alan Cox et al as a result of their 2002 paper on superpages. Superpages are a method for solving a bottleneck in modern architectures.

From EuroBSDCon 2009

Right at the top of the memory access hierarchy is the Translation Lookaside Buffer, or TLB. The TLB is used to cache the mapping between page virtual addresses to page physical addresses, but has not grown in size at the same rate as available main memory has grown. A common maximum size is 1MB, which, when your page size is 4K, only allows for 4MB worth of virtual addresses to be in the TLB at any one time. With high-end systems these days approaching 32 or even 64GB of RAM, and typical working set sizes being much higher than 4MB the TLB undergoes significant churn. The solution is to use a page size that's larger than 4K -- a superpage. Some architectures have support for many different page sizes. The i386 architecture however is limited to either 4K or 2MB. A 2MB page size would allow the TLB to cache mappings for 2GB of RAM, and provide a large speed improvement to any program that processes a significant amount of data. Kirk went on to describe the work that Alan and others have done to implement superpage support on FreeBSD, and the heuristics the system uses to determine whether to collapse a 2MB contiguous chunk of RAM in to a superpage. He presented benchmark results that show superpages providing somewhere between a ~ 15 - 600% (!) speed improvement under typical workloads.

The next session was Chris Buechler giving an introduction to pfsense, and an overview of what new features will be in the upcoming 2.0 releases. pfsense is a FreeBSD distribution designed to run as an embedded firewall or router, although that description barely covers its capabilities. Amongst other things the 2.0 code includes is a major overhaul of the configuration UI, generalisation of interface support so that pfsense now works with any number of interfaces rather than 3, numerous new networking technologies, and an easy way to provide additional functionality via packages instead of bloating the base system.

Stanislav Sedov then described work that he had undertaken to build an embedded GPS navigation and tracking device designed to be deployed in harsh industrial environments. This included porting FreeBSD to an Atmel AT91RM9200 CPU, improving the device's bootloader support so it could boot from UFS, reducing the size of the image, and providing support for reliable in-the-field upgrades.

The final session of the day was an invited talk from Dr. Richard Clayton on the theme of "Evil on the Internet". I was fortunate enough to see Dr. Clayton give a version of this talk at Google about 18 months ago. Since then he's updated it to cover more examples of how people are using the internet to phish, scam, defraud, and otherwise attempt to part people from their money.

Sunday I joined the first session of the second track, an introduction to mfsBSD, a toolset to create memory filesystem based FreeBSD distributions. Martin Matuška explained his motivations behind the project, which was to find an easy way to replace Linux in the ISP-hosted environment he was using, but mfsBSD can now be used to make upgrades easier, provide a rescue partition, a USB bootable install of FreeBSD, and so on.

From EuroBSDCon 2009

Next was Brooks Davis, presenting a roundup of the results of FreeBSD's participation in the 2009 Google Summer of Code. Of the 20 FreeBSD projects that were accepted as part of Summer of Code, 17 were successful. These included efforts that improved the performance of the ipfw firewall code, introduced support for stackable cryptographic filesystems, and enhanced the infrastructure for tracking software licenses in the ports tree, making it easier for users and distributors to ensure that they are using software that complies with their local licensing requirements.

From EuroBSDCon 2009

Alastair Crooks followed this with a discussion of his work on netPGP, a BSD licensed implementation of PGP that is configuration-compatible with gnuPG. As well as covering the ins and outs of the work Alastair's presentation was notable for employing some truly terrible (but memorable) visual puns. I was groaning too much through them to take pictures, but if I tell you that the slide titled "Use Cases" had as the accompanying illustration a picture of some sheep next to some hat boxes you might get an idea. Must have worked though, since I can still remember the slides.

Kris Moore from iXSystems then demonstrated the work that they've been doing on the PC-BSD distribution of FreeBSD. Apart from making the installation process considerably simpler and improving the initial user experience they've also developed an alternative binary package mechanism, which they call PBI. The PBI format works to avoid problems caused by upgrades to shared libraries that should be backwards compatible but aren't, and does this by bundling a copy of all the shared libraries required by the application in to the package directory, making each installed package completely self-contained and upgradeable without interfering with any other applications that are installed.

The "state of BSD" sessions at these conferences are always entertaining, and this year was no exception. Alastair Crooks for NetBSD, Owain Ainsworth and Henning Brauer for OpenBSD, and George Neville-Neil for FreeBSD presented updates on the current state and future plans of each of these systems.

From EuroBSDCon 2009

EuroBSDCon concluded with a number of lightning talks covering various works in progress (or WIPs), both large and small. The most interesting, for me, was the update by Pawel Jakub Dawidek on the state of ZFS support in FreeBSD. This is something that was just coming to FreeBSD around the time that I was running out of time to pay attention on a day-to-day basis. Since then support for ZFS has improved tremendously, and probably the comment I heard most repeatedly at the conference was how useful people are finding it.

And with that, the conference closed. Organizers were thanked, and delegates prepared themselves for the journey home.

1 The irony of v4 of the software being the first to support IPv6 is not lost on them.

By Nik Clayton, Site Reliability Engineer


.