Cell - Julio Merino (jmmv.dev)

Past days' work

Been tracking and resolving a bug in Linux's SPU scheduler for the last three days, and fixed it just a moment ago! I'm happy and needed to mention this ;-)

More specifically, tracking it down was fairly easy using SystemTap and Paraver (getting the two to play well together was another source of headaches), but fixing it was the hardest part due to deadlocks popping up over and over again. Sorry, can't disclose more information about it yet; I want to think a bit more about how to make this public and whether my fix is really OK or not. But be sure I will!

December 7, 2007 · Tags: cell, linux

Thanks, SystemTap!

I started this week's work with the idea of instrumenting the spufs module found in Linux/Cell to be able to take some traces of the execution of Cell applications. At first, I modified that module to emit events at certain key points, which were later registered in a circular queue. Then, I implemented a file in /proc so that a user-space application could read from it and free space from the queue to prevent the loss of events when it was full.
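A minimal sketch of such a queue follows, written in Python rather than kernel C so that it can be run anywhere. It is an illustration only, not the actual spufs change: the EventQueue class, its method names and the policy of dropping new events once the queue is full are all assumptions.

```python
class EventQueue:
    """Fixed-size circular queue of trace events (hypothetical model)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0      # index of the oldest unread event
        self._count = 0     # number of events currently stored
        self.dropped = 0    # events lost because the queue was full

    def emit(self, event):
        """Producer side: called at the instrumented points."""
        if self._count == len(self._slots):
            # Assumed policy: a full queue loses the new event.
            self.dropped += 1
            return False
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = event
        self._count += 1
        return True

    def read(self, max_events):
        """Consumer side: models a read of the /proc file, which
        returns the oldest events and frees their slots."""
        events = []
        while self._count > 0 and len(events) < max_events:
            events.append(self._slots[self._head])
            self._slots[self._head] = None
            self._head = (self._head + 1) % len(self._slots)
            self._count -= 1
        return events
```

The point of the reader is visible in the dropped counter: as long as user space calls read() often enough, emit() always finds a free slot and the counter stays at zero.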

That first implementation never worked well, but as I liked how it was evolving, I thought it could be a neat idea to make this "framework" more generic so that other parts of the kernel could use it. I rewrote everything with this idea in mind and then also modified the regular scheduler and the process-management system calls to raise events for my trace. And got it working.

But then, I was talking to Brainstorm about his new "Sun Campus Ambassador" position at the University, and during the conversation he mentioned DTrace. So I asked... "Mmm, that tool could probably simplify all my work; is there something similar for Linux?". And yes; yes there is! Its name: SystemTap.

As the web page says, SystemTap "provides an infrastructure to simplify the gathering of information about the running Linux system". You do this by writing small scripts that hook into specific points of the kernel (at the function level, at specific marker points, etc.). The script is compiled and installed into the live kernel as a loadable kernel module, and its handlers run whenever those points are hit.

With this tool I can discard my several-hundred-line changes to gather traces and replace them with some very, very simple SystemTap scripts. No need to rebuild the kernel, no need to deal with custom changes to it, no need to rebuild every now and then... neat!

Now I'm having problems using the feature that allows instrumenting kernel markers, and I need them because otherwise some private functions cannot be instrumented due to compiler optimizations (I think). OK, I could expose those functions, but while I'm at it, I think it'd be a good idea to write a decent tapset for spufs that could later be published. And that prevents me from doing such hacks.

But anyway, kudos to the SystemTap developers. I now understand why everybody is so excited about DTrace.

December 2, 2007 · Tags: cell, linux, systemtap

Hello world in Linux/ppc64

I've decided to improve my knowledge of the Cell platform, and the best way to get started seems to be to learn 64-bit PowerPC assembly given that the PPU uses this instruction set. Learning this will open the door to some more interesting tricks with the architecture's low-level details.

There are some excellent articles at IBM developerWorks dealing with this subject, and thanks to the first one in an introductory series to PPC64 I've been able to write the typical hello world program :-)

Without further ado, here is the code!

#
# The program's static data
#

.data

msg:    .string "Hello, world!\n"
        length = . - msg

#
# Special section needed by the linker due to the C calling
# conventions in this platform.
#

.section ".opd", "aw"               # aw = allocatable/writable

.global _start
_start:
        .quad ._start, .TOC.@tocbase, 0

#
# The program's code
#

.text

._start:
        li      0, 4                # write(2)
        li      3, 1                # stdout file descriptor
        lis     4, msg@highest      # load 64-bit buffer address
        ori     4, 4, msg@higher
        rldicr  4, 4, 32, 31
        oris    4, 4, msg@h
        ori     4, 4, msg@l
        li      5, length           # buffer length
        sc

        li      0, 1                # _exit(2)
        li      3, 0                # return success
        sc

You can build it with the following commands:

$ as -a64 -o hello.o hello.s
$ ld -melf64ppc -o hello hello.o

I'm curious about as(1)'s -a64 option; its purpose is pretty obvious, but it is not documented anywhere in the manual page or in the info files.
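Going back to the listing, the least obvious part is the five-instruction sequence that loads the address of msg into register 4. Each instruction carries only a 16-bit immediate, so the 64-bit address is assembled from four pieces: @highest, @higher, @h and @l. The following Python sketch models that arithmetic so the sequence can be checked by hand. The function names are made up for illustration, and the instruction models cover only what this sequence needs (notably the sign extension done by lis), not the full instruction semantics.

```python
MASK64 = (1 << 64) - 1


def split_address(addr):
    """Return the four 16-bit pieces selected by the assembler operators."""
    return {
        "highest": (addr >> 48) & 0xFFFF,  # bits 63..48
        "higher": (addr >> 32) & 0xFFFF,   # bits 47..32
        "h": (addr >> 16) & 0xFFFF,        # bits 31..16
        "l": addr & 0xFFFF,                # bits 15..0
    }


def lis(imm16):
    """Load the immediate shifted left by 16, sign-extended to 64 bits."""
    value = (imm16 << 16) & 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def rldicr(value, shift, mask_end):
    """Rotate left by 'shift', then keep bits 0..mask_end counting
    from the most significant bit (the numbering the ISA uses)."""
    rotated = ((value << shift) | (value >> (64 - shift))) & MASK64
    mask = (MASK64 << (63 - mask_end)) & MASK64
    return rotated & mask


def load_address(addr):
    """Replay the sequence from the listing for an arbitrary address."""
    p = split_address(addr)
    r4 = lis(p["highest"])     # lis    4, msg@highest
    r4 |= p["higher"]          # ori    4, 4, msg@higher
    r4 = rldicr(r4, 32, 31)    # rldicr 4, 4, 32, 31
    r4 |= p["h"] << 16         # oris   4, 4, msg@h
    r4 |= p["l"]               # ori    4, 4, msg@l
    return r4


# The rebuilt value matches the original, even when the top bit is set
# and lis sign-extends: rldicr clears whatever ends up in the low half.
assert load_address(0x0123456789ABCDEF) == 0x0123456789ABCDEF
assert load_address(0x8001000000000010) == 0x8001000000000010
```

The rotate-and-clear step is what makes the trick work: the first two instructions build the upper 32 bits in the lower half of the register, and rldicr moves them into place while zeroing the lower half for the remaining two pieces.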

Anyway, back to coding! I guess I'll post more about this subject if I find interesting and/or non-obvious things that are not already documented clearly anywhere. But for beginner's stuff you already have the articles linked above.

November 25, 2007 · Tags: cell, linux, powerpc

Mad at the Cell SDK

I've been installing the Cell SDK 3.0 on two Fedora 8 systems at home (a PlayStation 3 and an old AMD box) and I cannot understand how someone (IBM and BSC) can publish such an utterly broken piece of crap and be proud of it. Sorry, had to say it. (If you are one of those who wrote the installer, please excuse me, but that's what I really think. Take this as constructive criticism.)

Before saying that Fedora 8 is not supported and that I should only run this on Fedora 7, shut up. I am sure all the problems are there too, because none of them can be related to the system version.

Strictly speaking, it is not that the installer does not work, because if you follow the instructions it does. But it is a very strange program that leaves garbage all around your system, produces warning messages during execution and the garbage left around will keep producing warnings indefinitely. Plus, to make things worse, the network connection to the BSC — where the free software packages are downloaded from by yum — is extremely unreliable from outside the university's direct connection (that is, from home), which means that you will have to retry the installation lots of times until you are able to download all the huge packages. (In fact, that's what annoys me most.) And this is not a problem that happened today only; it also bit me half a year ago when installing the 2.1 version.

Let's talk about the installer, that marvelous application.

Starting with version 3, the SDK is composed of an RPM package called cell-install and two ISO images (Devel and Extras). When I saw that, I was pretty happy because I thought that, with the RPM package alone, I'd be able to do the whole installation without having to deal with ISO images. It turns out that this is not true, as some components only seem to be available from within them (most likely the non-free ones, but I haven't paid attention).

Ah, you want to know what the SDK contains. Basically, it is composed of a free GCC-based toolchain for both the PPU and SPUs, the free run-time environment (the libspe2), a proprietary toolchain, the proprietary IBM SystemSim for Cell simulator and some other tools (a mixture of free and non-free ones). So, as you can see, we have some free components and some proprietary ones. You can, in fact, develop for the Cell architecture by using the free components alone. So why on earth do you need the proprietary ones? Why can't you skip them? Why aren't they available in some nice repository that I can use without any external "installer" and avoid such crap? That's something I don't get. (Maybe it's possible with some extra effort, but not what the instructions tell you.)

OK, back to the installer. So you need to copy the two ISOs into a temporary directory, say /tmp/iso, and then run the installer by doing something like:

# cd /opt/cell
# ./cellsdk --iso /tmp/iso install

This will first proceed to show you some license agreements. Here is one funny point: you must accept the GPL and LGPL terms. Come on! I am using Fedora, and I am already using lots of GPL'd components for which I did not see the license. Why do I have to do that? And why do I have to reaccept the IBM license terms when I already did that on the downloads page?

After the license thing, it mounts the images under /tmp/sdk (keep this in mind because we'll get back to it later), probably does some black magic and finally launches yum groupinstall with multiple parameters to install all the SDK components. All right, you accept the installation details and it starts installing stuff. This would be OK if it wasn't for the network connection problems I mentioned earlier; I've had to restart this part dozens of times (literally) to be able to get all packages. So, again, question: why couldn't you simply tell me what to put in yum's configuration, define some installation groups for the free components alone and let me use yum to install those without having to trust some foreign and crappy installer script? Why do you insist on using /opt for some components, and inconsistently between architectures at that?

And why did I mention /tmp/sdk? Because the yum repositories registered by the installer have this location hardcoded in. Once you unmount the ISO images (that is, when the installation is done), yum will keep complaining about missing files in /tmp/sdk forever — unless you manually change yum's configuration, that is. What is nicer, though, is that yum always complains about one specific repository because it is only available online, yet it also looks for a corresponding image in /tmp/sdk.
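As a rough idea of what that manual change could look like, the Python sketch below takes the text of a yum .repo file and disables every repository whose baseurl points below /tmp/sdk. It is only an illustration under stated assumptions: the installer's actual repository names and file layout are not given here, so the function name is hypothetical and the function works on whatever text it is handed. Setting enabled=0 is just one way to silence the warnings; removing the entries would work as well.

```python
import configparser
import io


def disable_stale_repos(repo_text, stale_path="/tmp/sdk"):
    """Return (new_text, disabled_ids) for the contents of a .repo file.

    Any section whose baseurl mentions stale_path gets enabled=0;
    all other sections are written back unchanged.
    """
    # Interpolation is turned off so '%' characters in URLs are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)

    disabled = []
    for repo_id in parser.sections():
        baseurl = parser.get(repo_id, "baseurl", fallback="")
        if stale_path in baseurl:
            parser.set(repo_id, "enabled", "0")
            disabled.append(repo_id)

    out = io.StringIO()
    # Write key=value without surrounding spaces, the usual .repo style.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue(), disabled
```

Run it on a copy of each file first and compare the result before overwriting anything under yum's configuration directory, since the rewrite drops comments from the file.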

Finally, there are also some random problems (probably caused by all the above inconsistencies). Once, the script finished successfully but the SDK was left half installed: some components were missing. Another time, the installer hung in the middle (no CPU consumption at all, no system activity) when it seemed it had finished, and I had to kill it manually. After restarting it, it turned out that it had indeed not finished, as it still had to install some more stuff.

Summarizing... all these problems may not be so important, but they make one feel that the whole SDK is a very clunky thing.

I wish someone could create native packages for the free components of the SDK and import them into the official Fedora (or, please, please, please, Debian) repositories. After all, these are just a native compiler for the PPU, a cross-compiler for the SPUs, the libspe2 and the SPU's newlib. Note that the GCC backends for both the PPU and SPUs are already part of the FSF trees, so it shouldn't be too difficult to achieve by using official, nicer sources.

Rant time over.

November 18, 2007 · Tags: cell, fedora

PFC report almost ready

The deadline for my PFC (the project that will conclude my computer science degree) is approaching. I have to hand in the final report next week and present the project on July 6th. Its title is "Efficient resource management in heterogeneous multiprocessor systems" and its basic goal is to inspect the poor management of such machines in current operating systems and how this situation could be improved in the future.

Our specific case study has been the Cell processor, the PlayStation 3 and Linux, as these form a clear example of a heterogeneous multiprocessor system that may become widespread due to its relatively cheap price and the attractive features (gaming, multimedia playback, etc.) it provides to a "home user".

Most of the project has been an analysis of the current state of the art and the proposal of ideas at an abstract level. Due to timing constraints and the complexity of the subject (should I also mention bad planning?), I have been unable to implement most of them even though I wanted to do so at the very beginning. The code I've done is so crappy that I won't be sharing it anytime soon, but if there is interest I might clean it up (I mean, rewrite it from the ground up) and publish it to a wider audience.

Anyway, to the real point of this post. I've published an almost definitive copy of the final report so that you can take a look at it if you want to. I will certainly welcome any comments you have, be it mentioning bugs, typos, wrong explanations or anything! Feel free to post them as comments here or to send me a mail, but do so before next Monday as that's the deadline for printing. Many thanks in advance if you take the time to do a quick review!

(And yes... this means I'll be completely free from now on to work on my SoC project, which is being delayed too much already...)

Edit (Oct 17th): Moved the report on the server; fixed the link here.

June 19, 2007 · Tags: cell, linux, pfc, ps3

Building the libspe2 on the PS3

The Linux kernel, when built for a Cell-based platform, provides the spufs pseudo-file system that allows userland applications to interact with the Synergistic Processing Engines (SPEs). However, this interface is too low-level to be useful for application-level programs and hence another level of abstraction is provided over it through the libspe library.

There are two versions of the libspe:

1.x: Distributed as part of the Cell SDK 2.0, it is the version most widely used nowadays by applications designed to run on the Cell architecture.
2.x: A rewrite of the library that provides a better and cleaner interface (e.g. fewer black boxes), but which is currently distributed for evaluation and testing purposes. Further development will happen on this version, so I needed to have it available.

The YellowDog Linux 5.0 (YDL5) distribution for the PlayStation 3 only provides an SRPM package for the 1.x version, but there is no support for 2.x. Fortunately, installing the libspe2 is trivial if you use the appropriate binary packages provided by BSC, but things get interesting if you try to build it from sources. As I need to inspect its code and make some changes to it, I have to be able to rebuild it, so I had to go with the latter option.

Let's see how to build and install libspe2 from sources on a PS3 running YDL5.

The first step is to download the most up-to-date SRPM package for the libspe2, which at the time of this writing was libspe2-2.0.1-1.src.rpm. Once downloaded, install it on the system:

# rpm -i libspe2-2.0.1-1.src.rpm

The above command leaves the original source tarball, any necessary patches and the spec file all properly laid out inside the /usr/src/yellowdog hierarchy.

Now, before we can build the libspe2 package, we need to fulfill two requisites. The first is the installation of quilt (for which no binary package exists in the YDL5 repositories), a required tool in libspe2's build process. The second is the updating of bash to a newer version, as the one distributed in YDL5 has a quoting bug that prevents quilt from being built properly.

The easiest way to solve these problems is to look for the corresponding SRPM packages for quilt and an updated bash. As YDL5 is based on Fedora Core, a safe bet is to fetch the necessary files from the Fedora Core 6 (FC6) repositories; these were: quilt-0.46-1.fc6.src.rpm and bash-3.1-16.1.src.rpm. After that, proceed with their installation as shown above for libspe2 (using rpm -i).

With all the sources in place, it is time to build and install them in the right order. Luckily the FC6 SRPMs we need work fine in YDL5, but this might not be true for other packages. Here is what to do:

# cd /usr/src/yellowdog/SPECS
# rpmbuild -ba --target=ppc bash.spec
# rpm -U ../RPMS/ppc/bash-3.1-16.1.ppc.rpm
# rpmbuild -ba --target=ppc quilt.spec
# rpm -i ../RPMS/ppc/quilt-0.46-1.ppc.rpm
# rpmbuild -ba libspe2.spec
# rpm -i ../RPMS/ppc64/libspe2-2.0.1-1.ppc64.rpm
# rpm -i ../RPMS/ppc64/libspe2-devel-2.0.1-1.ppc64.rpm

And that's it! libspe2 is now installed and ready to be used. Of course, with the build requisites in place, you can also compile libspe2 in your home directory for testing purposes by using the tar.gz package instead of the SRPM.

Finally, complete the installation by adding the elfspe2-2.0.1-1.ppc.rpm package to the mix.

March 14, 2007 · Tags: cell, pfc, ps3, yellowdog

PFC subject chosen

A while ago, I was doubtful about the subject of my undergraduate thesis (or PFC as we call it). At first, I wanted to work on a regression testing framework for NetBSD. This is something I really want to see done and I'd work on it if I had enough free time now... Unfortunately, it didn't quite fit my expectations for the PFC: the project was not related at all to the current research subjects in my faculty, hence it was not appropriate to integrate into one of these work groups.

So, after investigating some of the projects proposed by these research groups, I've finally settled on one that revolves around heterogeneous multiprocessor systems such as the Cell Broadband Engine. The resulting code will be based on Linux as it is the main (only?) platform for Cell development, but the concepts should still be applicable to other systems. Who knows, maybe I'll end up trying to port NetBSD to a Cell machine; shouldn't be too hard if that G5 support is integrated ;-)

The preliminary title: Efficient resource management in heterogeneous multiprocessor systems. For more details, check out the Project proposal (still not finalized, as you can see).

January 27, 2007 · Tags: cell, pfc
