David Brown’s Blog

David Brown  //  Software engineer/Jazz musician.

Jul 28 / 11:42am

Re-learning C++

I first learned C++ somewhere around 1990, as a student, in a programming course. The main thing I remember from that time is that CFront used a preprocessor that knew nothing about C++, which occasionally caused weird syntax errors in comments.

About 6 years ago, I tried using C++ again to build a unit test framework for a filesystem I was writing. It was probably a bad idea to use an important project as a place to learn C++, and I underestimated how much the language had changed. I ended up writing the tests in OCaml, which helped with the type safety. However, I spent a lot of time writing bindings to the C code I was testing, and the choice turned out to be a disadvantage to other members of the team, who had never worked with functional languages, let alone ML languages.

About a month ago, I decided to actually learn a modern version of C++.

Changes

The changes I mainly had to learn were:

  • Namespaces
  • Multiple inheritance
  • Exceptions
  • Templates
  • The STL

Of these, the STL probably took the most time to learn. The other mechanisms are largely used in similar ways in other languages. The specifics of multiple inheritance in C++ are different from other languages I’ve used, but they seem to be coherent.

Misconceptions

I had some fairly serious misconceptions of C++ that I am glad to have been able to get past.

Strong typing

C++ is a fairly strongly-typed language, much more so than C. Although it allows C-style casting, it offers plenty of mechanisms to avoid it, and the compiler enforces fairly strict type coherency.

What had confused me is that the template mechanism does not enforce type constraints. Languages such as Ada or Scala have strongly typed generics (Java less so), which require a fairly rich type system and also tend to force type relationships when they aren’t completely necessary (if a generic wants to use a feature of the type parameter, that parameter must be restricted to a type that supports that feature). Templates, on the other hand, are resolved at each instantiation. This is more flexible and still just as strongly typed, but it tends to produce amazingly poor error messages.

Garbage collection

C++ does pretty much shun garbage collection. Although Boehm can be bolted on to a C++ app, it isn’t a style of programming that is commonly used in C++.

However, C++ makes up for this by providing full control over construction and destruction of objects. This allows for full memory management. The disadvantage is that it is harder to manage sharing of objects, which usually requires some type of smart pointer that does reference counting. There does tend to be a bit more copying of objects with the C++ style, and I’m not completely sure of the tradeoffs between the extra copying and the extra work of a garbage collector. It likely depends on the particular application.

Binding to C

Interfacing to C code and libraries is clearly where C++ wins over pretty much any other higher-level language. I suspect this is the major reason for the success of C++. The kind of programming I usually end up doing (systems type, such as backups) requires me to bind to system-level calls for I/O and such. In C++, I am able to call these functions directly, without having to worry about structure formats and calling conventions.

Conclusion

I’m not certain how much I’ll be using C++ for programming projects. I’ve started rewriting some small parts of JPool in C++. It is a good exercise in learning the language, but I still have quite a bit of learning to do before I can determine if it would be an efficient language for writing code. I still like Scala quite a bit, and will definitely keep my Scala version of the backup software alive. Perhaps I will create a restore utility written in C++ to make it easier to restore from a rescue disk.

There still seems to be a lot of crappy C++ code out there. This is probably mostly because of the popularity of the language, not any inherent feature of it. I shouldn’t let the poor code I have seen deter me from what benefits the language might offer.


Filed under // c++ languages


Jul 28 / 10:58am

Markdown support in posterous

It’s been a while since I’ve written anything up in my blog, mostly because I’ve been busy with work and other real-world things. But now that Posterous supports Markdown as a format for blog postings, most of my barriers to writing posts should be gone. Part of the purpose of this post is to make sure this actually works.


May 2 / 10:56pm

A brief review of the new MacBook Pro

I got my new MacBook Pro 15-inch on Friday, several days earlier than I was expecting. Overall, I have to say that I'm quite happy with it, although Friday evening was a bit on the frustrating side. I learned the hard way that this new machine is not compatible with my old Netgear WGR614 router. It connects, but the connection is unreliable, drops a lot of packets, and makes for an otherwise unpleasant experience. I went to the Apple store Saturday morning, and picked up an AirPort Extreme, which was easy to set up, and works quite well.

I special ordered the machine with the higher-resolution screen (non-glossy, I'll get to that in a moment), and a 256GB SSD. I haven't really decided if the SSD is worth the $600, but it certainly is nice. It boots almost instantly, and operations such as installs are much faster. It also makes for a very quiet machine, and one that I don't have to worry about moving around. The higher-resolution screen is very nice. I can get a lot more on the screen, but it also just makes images look nicer.

As far as glossy/non-glossy goes, this is a complete scam. The non-glossy screen is beautiful, hardly distinguishable from one of the glossy screens, except for the fact that I don't have to try to see past my own reflection in order to read the screen. I guess I do miss out on being able to see the aliens sneaking by behind me while I'm working, though. I really think this whole glossy screen business is a way to make the displays cheaper, and a bunch of marketing spin to convince people that the inferior screen is actually an advantage.

The keyboard on the machine has a fairly nice feel. I was a little concerned about the caps lock key (which I configured to be a control key), since my BT flat keyboard has problems missing that key. The keyboard on the MBP has a much smoother feel than the BT keyboard, and I really haven't had any problems with it. It's definitely better than the keyboard on my 2008 (with Santa Rosa). The keyboard is the one thing I miss from the ThinkPad, though.

Setup

Once I got networking, I did some significant setup. I booted the install DVD, and shrank the OSX partition by about 32GB to make room for a native Linux partition. I used parted to make a swap and root filesystem for Linux, and then booted into rEFIt and used its partition fixer to fix the legacy partition so that the legacy bootloader could easily boot Linux. I installed Arch Linux without any real hitches, using the wired ethernet. The built-in wireless ethernet seems to be a very new device, and even the Broadcom driver didn't seem to work (it talks to it, but won't authenticate to the base station). I'm sure this will get working at some point, and it probably won't be long before I can use the b43 driver in the kernel.

I installed Virtualbox, and also put Arch Linux in that. It's a bit slower, but does integrate nicely with the OSX environment. That will probably be the normal environment I use Linux from on the machine.

As far as OSX software goes, I installed MacPorts to be able to easily build packages. I found a version of mplayer-enhanced which seems to be mostly as good as the Linux version. It's the only program I can find on the machine that will play 10Mb/s H.264 streams without glitching.

All in all, I'm quite happy with the new machine.


Filed under // macbook


Apr 30 / 2:04am

Jpool updated to Scala 2.8

I have finished updating jpool to Scala 2.8 (currently 2.8.0.RC1). I will leave this development on the 'try-2.8' branch until 2.8.0 is released. However, now that this seems to be working fully, I will not be backporting changes to the old branch.

I ran into some interesting problems with the conversion. The most tedious to fix was that Scala 2.8 doesn't auto-import parent packages from a package statement. Once I figured out the correct way of handling this, things got a lot better. Basically:

    package org.davidb.jpool.tools

can be changed to

    package org.davidb.jpool
    package tools

and this will auto-import both 'org.davidb.jpool' and the tools package.

The other main effort was because of the conversion of the containers. Since jpool creates several of its own containers, these had to be updated to use the new naming system. Stacks have been fixed to actually be implemented as stacks, which ended up simplifying some of the code that used them.

Previously, I was using streams to iterate the directory trees in the filesystem. Moving to 2.8 provoked some space leaks in my code. However, the new collection classes make Iterator as convenient to use as streams, while nicely maintaining the overwriting behaviour of the iterator. I converted all of the Streams into Iterators and eliminated the space leaks. This seems similar to Clojure: it is rather hard to use streams on the JVM without space leaks, since the JVM doesn't seem very good at determining the lifetime of locals.

Beyond this, I've now implemented a 'clone' tool. This tool allows individual snapshots to be migrated from one pool to another. This can be used as a kind of poor-man's garbage collection. I've been using this to make weekly pool snapshots, which helps keep this weekly pool smaller, since it has fewer snapshots in it.


Filed under // jpool scala


Oct 18 / 4:25am

Asure 1.00 released

I have released version 1.00 of my Asure file integrity program. This is a small Python program that captures file hashes and permissions over a directory tree, and can be used either to look for changes (similar to Tripwire), or to verify that the files are properly restored when testing backups. It is primarily intended for the latter. It has an important command, update, which only rehashes files that have been touched since the last run, which allows the database to be kept up to date fairly quickly.

This release mostly fixes a warning with newer versions of Python, and has a proper setup.py to make it easier to package. The page above also has a link to an Arch Linux User Repository (AUR) package that allows it to easily be installed on Arch Linux.


Filed under // asure


Oct 5 / 5:56am

Arch Linux

This weekend, I did a couple of installs of Arch Linux. So far, it seems like it is going to be a pretty good fit for what I want to do.

I had several frustrations with any of the Debian-based distributions: mostly that they have definitive releases, and getting recent versions of packages takes a long time. Even Gentoo is starting to get slow about releases, but the overlay system helps for cutting edge things. I've also found that both of these make it somewhat awkward and difficult to package up my own things.

This is where Arch really shines. The Arch package manager is a binary package manager, similar to dpkg, but a bit simpler. It manages dependencies and upgrades, and tracks files, although it just drops config files into /etc/filename.pacnew and lets the user manage the updates. Gentoo used to be similar, but several tools now help manage these, and I suspect they could be adapted or rewritten for Arch.

But, building Arch packages is really easy. The abs tool will synchronize all of the package descriptors for all of the arch packages that are “trusted”. These can easily be copied somewhere and the package rebuilt, similar to BSD ports. However, the result is a binary package that can be managed by pacman. The Arch User Repository holds packages uploaded by arbitrary users. It allows commentary, and voting, and well-done packages can be promoted into the regular Arch distribution.

I built my first package, of Aegis, which has its very own package page.

The other thing that is nice about Arch is that the config files (in /etc) are much simpler than most distributions, more like BSD scripts than a typical Linux machine. Most stuff is a bunch of shell variables set in /etc/rc.conf, with a handful of other things in other files. The installer just puts you in an editor with these files, and it is fairly easy to figure out.

What will be really interesting to see is how well it handles upgrades as time goes on, since this is the difficulty of any distro that does incremental upgrades.


Aug 22 / 3:03pm

NX and MacOS

My primary home machine is a Mac Pro desktop. Although the machine dual boots between MacOS X and Linux, it is rather inconvenient to do so. I have been running Linux in a VM, but find that the performance isn't all that great.

I decided to give No Machine's NX system a try. I chose FreeNX for the server, mostly because it is GPL, and the source is available. It was fairly easy to install on Gentoo, just:

    sudo emerge nxserver-freenx

and wait a short while.

I downloaded the MacOS NX client from No Machine, since there doesn't appear to be a free version. This client is ppc only, and feels very much like a non-native Mac App. Fortunately, once you are past login, the application comes across as just a single large window.

Unfortunately, it isn't a native MacOS app, but an X11 app. The first thing I discovered is that the keyboard layout is abysmal. After playing around with 'xev' and 'xmodmap', I came up with the following xmodmap config to make the keys better:

    keycode 66 = Alt_L
    keycode 63 = Super_L
    keycode 71 = Super_R
    keycode 69 = Alt_R

    clear Mod1
    clear Mod4
    add Mod1 = Alt_L Alt_R
    add Mod4 = Super_L Super_R

NX seems to update the keymap on the remote X server to match the current client, so it doesn't seem to be a problem switching between clients.

Although I tell the NX client to make the window the largest size, I still seem to have to click on the green “maximize” button to make it fill the screen. There's still the Mac menu bar at the top, and a window border below that, and it'd be really nice to get full screen to work, but this is quite usable now.

I tried using the connection over a wired LAN, WiFi, as well as an EvDO modem. All of these configurations are quite usable. With the local networks, videos play, although I don't have any audio (the server doesn't have speakers, and I suspect it is going that route).

The last thing I did was to update the keypair used to authenticate the NX ssh login. NX doesn't listen on a port, but uses ssh to connect to the server, always logging in as the 'nx' user. It ships with a keypair that allows the client to connect without any configuration; however, this then relies on the NX password authentication. Fortunately, simply running

    sudo nxkeygen

generated a new keypair. I then looked at the file '/var/lib/nxserver/home/.ssh/client.id_dsa.key' and pasted the contents into the keypair in the NX client configuration. This matches my security model better, since I normally don't allow password logins on my machines.

I'll give this setup a try for a while. Hopefully it will not require me to 'unison' synchronize my data quite as much between so many different machines.

Update: Making clipboard sync work

Getting the clipboard to sync between MacOS and NX was challenging. NX had no problem with the sync, but Apple's X11 doesn't enable it by default.

To set this, completely exit the X11 program and, using a Terminal window, cd to ~/Library/Preferences and run 'open org.x.X11.plist'. Using the editor, change:


    Field                          Value
    enable_key_equivalents         false
    sync_clipboard_to_pasteboard   true
    sync_pasteboard                true
    sync_pasteboard_to_clipboard   true
    sync_pasteboard_to_primary     true
    sync_primary_on_select         true

It's probably possible to use other settings, but I was able to make this combination work. None of it seems to work if the key equivalents is not disabled, which means you can't use the Apple key shortcuts.

Jul 23 / 9:24am

Piano recital

Thanks to some help from my nephew (thanks Dustin), I was able to get video and audio recordings of my recent piano recital. The audio was recorded with 2 Shure KSM32 microphones, one for the piano and one for the drums, and another mic of an unknown type in front of the bass player's amp. I used a MOTU 828mkII as an audio input device (post mixer), and recorded the audio with Boom Recorder.

Audio editing was done with Logic Pro and the videos with Final Cut Pro.

First is I'll Remember April:

And second is In Your Own Sweet Way:


Filed under // jazz piano


Jul 10 / 6:44am

ICFP, well I guess not.

A few weekends ago, I got myself prepared to participate in the ICFP programming contest. I did it a few years back, and although it was a lot of work, it was fun. This year, I just couldn't motivate myself to even start. The only part of the problem that seemed interesting to me was the virtual machine itself, and even then, I couldn't motivate myself to do something so transitory.

So, instead, I started working on the Project Euler problems, in Haskell. I've been pushing my solutions to Github, but I don't recommend looking if you are considering working on the problems yourself.

This has renewed my interest/fascination with Haskell, and I've since dug up my Haskell implementation of the backup software “harchive”. The code has suffered some bitrot in the last few years and doesn't build any more, mostly as a consequence of changes in libraries I depend upon.

I'm currently working on implementing the new HashMap I came up with for Jpool. Ideally, I will have more than one implementation of this software that uses a compatible storage format. I enjoy programming in Haskell, but find that it also stretches my thinking a lot.

What's also been taking up a good bit of my time is practicing for a piano recital coming up on the 18th. I'll be doing two jazz songs with a trio, and we hope to get videos up on YouTube afterwards.


Jun 13 / 5:34am

Verifying Backups

How do I know if my backups are working? Ideally, you should test a full restore of a system to make sure that every part of the process works. I do this, but I don't think very many people do. Even just running a test restore onto a temporary directory or partition can tell you quite a bit. But still, how do I know if what I've restored is the same as what I backed up?

One way to do this is to compare the trees. A simple way to do this is:

        tar -cf - -C /dir1 . | tar -df - -C /dir2

which will compare the contents of dir1 and dir2. The biggest problem with this approach is that the older the backup you are testing is, the more likely it is that the filesystem has changed since running the backup.

Another approach is to compute hashes of the files, and check them:

        find . -type f -print0 | xargs -0 sha1sum > SHA1SUMS

and back the SHA1SUMS file up with the backup (the -print0 and -0 options keep file names containing spaces intact). Upon restore you can use the '-c' option to the sha1sum program to test the hashes. This works a lot better, but really only verifies the integrity of the contents, not of metadata.
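
The hash approach can be stretched to cover a little metadata as well. What follows is only a minimal sketch in Python, not how any existing tool does it: the names snapshot and differences are made up for illustration, and it assumes that permission bits are the only metadata of interest. Paths are recorded relative to the root, so a restore into a temporary directory can be compared against the original tree.

```python
import hashlib
import os
import stat


def snapshot(root):
    """Map each relative path under root to (sha1 hex digest, permission bits)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks and other special files
            digest = hashlib.sha1()
            with open(full, "rb") as handle:
                # Read in blocks so large files are not loaded whole.
                for block in iter(lambda: handle.read(65536), b""):
                    digest.update(block)
            rel = os.path.relpath(full, root)
            entries[rel] = (digest.hexdigest(), stat.S_IMODE(info.st_mode))
    return entries


def differences(before, after):
    """Return the sorted relative paths that were added, removed, or changed."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Calling differences(snapshot("/dir1"), snapshot("/dir2")) gives an empty list when the two trees agree on contents and permission bits. Ownership, timestamps, and directories themselves are deliberately left out of this sketch.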

There are programs, most famously Tripwire, that do a better job of managing this integrity. As far as I can tell, nobody uses these to verify backups. All of them seem to want absolute paths, and have no obvious way of verifying the integrity of a backup restored into a temporary directory.

A deeper problem, however, is that all of these scans involve computing the hashes of all of the files in the filesystem, for each backup. With good incremental or snapshot backups, this integrity scan can easily take longer than the whole backup itself.

Since I couldn't find anything that did what I wanted, I wrote my own program, Asure. It's a fairly small Python program that manages integrity snapshots of trees of files. Its update command has the useful feature that if a file's timestamps haven't changed since the last scan, the hash will not be recomputed. This does miss modifications to the file caused by underlying hardware problems, or subversions of the operating system, but tends to make the scan fast enough that it remains useful.
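
The idea behind skipping unchanged files can be sketched in a few lines of Python. This is not Asure's actual code or database format: the function names and the in-memory dictionary layout are assumptions made for illustration, and the file size is checked alongside the timestamps as an extra guard.

```python
import hashlib
import os
import stat


def file_sha1(path):
    """Hash a file's contents in blocks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def update(root, previous):
    """Rescan root, rehashing only files whose size or timestamps changed.

    previous maps relative path -> (mtime_ns, ctime_ns, size, sha1), a
    hypothetical layout chosen for this sketch.
    Returns (new_database, sorted list of paths that were rehashed).
    """
    current = {}
    rehashed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)
            if not stat.S_ISREG(info.st_mode):
                continue  # only regular files carry a content hash here
            rel = os.path.relpath(full, root)
            key = (info.st_mtime_ns, info.st_ctime_ns, info.st_size)
            old = previous.get(rel)
            if old is not None and old[:3] == key:
                digest = old[3]  # unchanged since the last scan: reuse the hash
            else:
                digest = file_sha1(full)
                rehashed.append(rel)
            current[rel] = key + (digest,)
    return current, sorted(rehashed)
```

The first call, update(root, {}), hashes everything; feeding the returned database into the next call only rehashes files that were touched in between. As noted above, a file whose contents change without its timestamps or size changing will slip through.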

I do feel it is important that this scan utility be a completely separate codebase from the backup software being used. It's tempting to use the file scanning libraries from jpool, since they do largely the same thing. However, much of what I'm trying to detect here is bugs in jpool. Hopefully bugs in the Python code and bugs in the Scala code are less likely to happen in the same place.
