Today, I installed NexentaOS on a Grid5000 cluster node. NexentaOS is basically Debian GNU/kOpenSolaris (Debian userland, with an OpenSolaris kernel. APT repository here). It works very well (good hardware support & detection, nice GNOME desktop). And I got the chance to play with DTrace.
Before writing this blog entry, I was considering writing a LinuxFR article about DTrace, when I came across this LinuxFR “journal” about Solaris 10, which gave me a good laugh. The best part is :
– DTrace : Sorte de surcouche de strace/ltrace. C’est peu intéressant. En démonstration, un type utilise DTrace pour “découvrir” que lancer une xterm écrit dans ~/.bash_history. C’est presque comique. (In approximate English: DTrace: a sort of layer on top of strace/ltrace. Not really interesting. In the demo, a guy uses DTrace to “discover” that launching an xterm writes to ~/.bash_history. It’s almost comical.)
It’s funny how people continue to compare DTrace to strace. It’s like comparing the GNOME project with fvwm. Yeah, both of them can display windows, and there are still people thinking that fvwm is enough for everybody.
OK, back to DTrace. DTrace is a tracing framework, which allows a consumer application (generally a D script) to register with some DTrace providers (probes) and get some data. This nice graph from the DTrace howto explains it much better than I do :
Most system monitoring tools on Linux use polling: they retrieve data from the system at regular intervals (think of top, “vmstat 1”, …). DTrace is event-driven instead: probes fire when something happens and push the data to the consumer, so you can observe events that a polling tool would miss between two samples. It also lets you monitor far more than current Linux tools do, in a very easy and clean way.
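To make the polling/push difference concrete, here is a small simulation in Python. It is only a sketch: the process names and lifetimes are invented for illustration, and the callback merely stands in for a probe firing. It does not use DTrace at all.

```python
# Hypothetical process lifetimes in milliseconds: (name, start, end).
# These numbers are invented for illustration; they are not measurements.
PROCESSES = [
    ("long_job", 0, 5000),
    ("echo_1", 1200, 1203),
    ("echo_2", 1210, 1212),
    ("echo_3", 2500, 2504),
]


def seen_by_polling(processes, interval_ms, duration_ms):
    """Names of processes alive at one sampling instant or more (like `vmstat 1`)."""
    seen = set()
    for t in range(0, duration_ms + 1, interval_ms):
        for name, start, end in processes:
            if start <= t < end:
                seen.add(name)
    return seen


def seen_by_events(processes):
    """Names reported when every process start triggers a callback."""
    seen = []

    def on_start(name):  # stands in for a probe firing
        seen.append(name)

    # Replay the starts in chronological order.
    for name, _start, _end in sorted(processes, key=lambda p: p[1]):
        on_start(name)
    return seen


polled = seen_by_polling(PROCESSES, interval_ms=1000, duration_ms=5000)
pushed = seen_by_events(PROCESSES)
print("polling saw:", sorted(polled))
print("events saw: ", pushed)
```

With a one-second sampling interval, the three short-lived processes start and exit between two samples, so only the long-running one is ever seen by the poller, while the event-driven version reports all four.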
With DTrace, you can monitor a lot of stuff and find the answer to a lot of questions, like :
- Monitor process creation, even the short ones. The execsnoop script (which would be a one-liner if you remove the output formatting, and is available in the DTrace Toolkit) shows that logging in by ssh and running
for i in $(seq 1 3); do /bin/echo $i; done
runs the following processes:
0 1172 1171 sh -c /usr/bin/locale -a
0 1172 1171 /usr/bin/locale -a
0 1175 1173 -bash
0 1177 1176 id -u
0 1179 1178 dircolors -b
0 1174 1173 pt_chmod 9
0 1180 1175 mesg n
0 1181 1175 seq 1 3
0 1182 1175 /bin/echo 1
0 1183 1175 /bin/echo 2
0 1184 1175 /bin/echo 3
- Monitor user and library function calls, and profile them like gprof (yeah, DTrace can replace gprof)
- Monitor system calls for the whole system or a specific app (yeah, DTrace can replace strace, but you already knew that ;). And you don’t need to restart the app before monitoring it.
- Replace vmstat: you get the usual vmstat figures, and you can also restrict them to events caused by a specific process.
- Measure the average latency between GET requests and their results when you browse the web with Mozilla. DTrace does this by monitoring the write syscalls issued by your browser that contain a GET, and measuring the delay before the subsequent
- Monitor all open syscalls issued on the whole system
- Monitor all TCP connections received by the system
- Analyze disk I/O : how much data was written/read to/from the disk, by which process. Nice way to understand buffering and I/O scheduling.
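The GET latency measurement mentioned in the list can be sketched in a few lines of Python. This is not a D script, and not necessarily how the original demo does it. The trace below is entirely hypothetical (timestamps, file descriptors and payloads are invented), and the sketch assumes the measurement ends at the next read on the same file descriptor.

```python
# Hypothetical syscall trace: (timestamp_ms, syscall, fd, data).
# Timestamps, descriptors and payloads are invented for illustration.
EVENTS = [
    (100, "write", 7, b"GET /index.html HTTP/1.1\r\n"),
    (140, "read", 7, b"HTTP/1.1 200 OK\r\n"),
    (150, "write", 9, b"GET /logo.png HTTP/1.1\r\n"),
    (155, "write", 3, b"log line\n"),   # not a GET: ignored
    (210, "read", 9, b"HTTP/1.1 200 OK\r\n"),
    (220, "read", 7, b"<html>"),        # no GET pending on fd 7: ignored
]


def get_latencies(events):
    """Delay between each write containing a GET and the next read on that fd."""
    pending = {}    # fd -> timestamp of the GET still waiting for an answer
    latencies = []
    for ts, syscall, fd, data in events:
        if syscall == "write" and b"GET " in data:
            pending[fd] = ts
        elif syscall == "read" and fd in pending:
            # Assumption: the first read after the GET marks the response.
            latencies.append(ts - pending.pop(fd))
    return latencies


latencies = get_latencies(EVENTS)
average_ms = sum(latencies) / len(latencies)
print("latencies (ms):", latencies)
print("average (ms):", average_ms)
```

The idea is the same as in a tracing script: remember a timestamp when the interesting event starts, key it by something that identifies the request (here the file descriptor), and compute the difference when the matching event arrives.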
Another example combining rwsnoop and iosnoop :
- I start rwsnoop -n bash (read/write monitor, only on processes named bash) and iosnoop -m / (I/O monitor, only on the root partition)
- I run :
echo blop > t
- rwsnoop shows all the read/write calls issued to write to my pseudo-terminal, and the write call to /root/t :
UID  PID   CMD   D  BYTES  FILE
[...]
0    1175  bash  R  1      /devices/pseudo/pts@0:1
0    1175  bash  W  1      /devices/pseudo/pts@0:1
0    1175  bash  W  5      /root/t
0    1175  bash  W  32     /devices/pseudo/pts@0:1
0    1175  bash  W  17     /devices/pseudo/pts@0:1
- iosnoop doesn’t display anything. But when I run the command that appears in the COMM column below (sync), iosnoop reports:
UID  PID   D  BLOCK  SIZE  COMM  PATHNAME
0    1264  W  96640  1024  sync  /root/t
- We can see that the write is buffered in the kernel, and that the 5 chars written to my file turned into 1024 bytes written to the disk. (Question for the reader: why 5? Yeah, it’s easy)
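The jump from 5 bytes to 1024 bytes is what you would expect if the disk is written in whole units. Here is that arithmetic as a minimal Python sketch. It assumes the smallest unit written is 1024 bytes, which is only inferred from the SIZE column above and is not a documented property of this filesystem.

```python
def round_up(nbytes, unit):
    """Smallest multiple of `unit` that can hold `nbytes` (0 stays 0)."""
    return -(-nbytes // unit) * unit   # ceiling division, then scale back up


LOGICAL_BYTES = 5   # size of the write to /root/t reported by rwsnoop
IO_UNIT = 1024      # assumption: smallest unit written to disk (iosnoop SIZE)

physical = round_up(LOGICAL_BYTES, IO_UNIT)
print(LOGICAL_BYTES, "bytes from the application ->", physical, "bytes on disk")
```

Under that assumption, any write from 1 to 1024 bytes would cost the same single unit on disk, and a 1025-byte write would cost two.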
Short conclusion: DTrace looks fantastic. As a toy, it lets you demonstrate and understand the inner workings of Solaris. As a tool, it can probably provide A LOT of useful info, especially since writing DTrace providers seems quite easy (Ruby provider, PHP provider).
Second short conclusion for those who really haven’t understood anything (some LinuxFR readers ;-): as you can see, when you run echo blop > t, “blop” is actually written to disk in /root/t. Fabulous, isn’t it?