Rust-1.0 Alpha2 bench

This is a follow-up to the LuaJIT benchmark. We use each 512-byte block as the key to dedup a big random binary file. I wrote the same benchmark in Rust to get an idea of how well it performs. Sadly, with Rust 1.0 alpha2, the std::old_io runtime library performs terribly: a naive read loop takes around 20 seconds on my machine, which is even worse than the Node.js figures. I then decided to skip the runtime library and call libc through FFI instead. That brought the figures much closer to what I expected, but it is still slower than the naive C++ implementation on Mac OS X.

Here is the timing result:

➜  rust git:(master) rustc --version
rustc 1.0.0-nightly (522d09dfe 2015-02-19) (built 2015-02-20)
➜  rust git:(master) time ./target/uniq-blocks
./target/uniq-blocks  2.95s user 1.58s system 99% cpu 4.560 total

And the code:

#![feature(libc)]
extern crate libc;
use libc::funcs::posix88::unistd::read;
use libc::funcs::posix88::fcntl::open;
use libc::consts::os::posix88::O_RDONLY;
use libc::consts::os::posix88::S_IREAD;
use libc::types::common::c95::c_void;
use std::ffi::CString;
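use std::collections::HashMap;

fn main() {
    let path_str = CString::new("large_random.txt").unwrap();
    let fd = unsafe {
        open(path_str.as_ptr(), O_RDONLY, S_IREAD)
    };
    if fd < 0 {
        panic!("cannot open large_random.txt");
    }

    // Reusable 512-byte read buffer; its length is set after each read.
    let mut buf: Vec<u8> = Vec::with_capacity(512);
    // Occurrence count per block. Keys are raw bytes, because random
    // binary data is generally not valid UTF-8.
    let mut counts: HashMap<Vec<u8>, u32> = HashMap::new();

    loop {
        let nread = unsafe { read(fd, buf.as_mut_ptr() as *mut c_void, 512) };
        // 0 means end of file, a negative value means a read error.
        if nread <= 0 {
            break;
        }
        // read() wrote through the raw pointer, so tell the Vec how many
        // bytes are now initialized (at most 512, within its capacity).
        unsafe { buf.set_len(nread as usize) };

        let key = buf.clone();
        if counts.contains_key(&key) {
            match counts.get_mut(&key) {
                Some(x) => { *x += 1 },
                None => ()
            };
        } else {
            counts.insert(key, 1);
        }
    }
}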
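
For comparison, here is a minimal sketch of the same block-counting loop written against the stable std::io::Read trait from later Rust releases, which was not available in this form in alpha2. The function name count_blocks and its block_size parameter are hypothetical, chosen only for illustration. The sketch has not been timed, so it says nothing about how it would compare with the figures above.

```rust
use std::collections::HashMap;
use std::io::{self, Read};

/// Counts how often each `block_size`-byte block occurs in `reader`.
/// A shorter trailing block, if any, is counted under its own key.
/// (Hypothetical helper, not part of the original benchmark.)
fn count_blocks<R: Read>(mut reader: R, block_size: usize) -> io::Result<HashMap<Vec<u8>, u32>> {
    let mut counts: HashMap<Vec<u8>, u32> = HashMap::new();
    let mut buf = vec![0u8; block_size];

    loop {
        // A single read() may return fewer bytes than requested, so keep
        // reading until the block is full or the input is exhausted.
        let mut filled = 0;
        while filled < block_size {
            let n = reader.read(&mut buf[filled..])?;
            if n == 0 {
                break; // end of input
            }
            filled += n;
        }
        if filled == 0 {
            break; // nothing left to count
        }

        // Keys are raw bytes: random binary data is usually not valid UTF-8.
        *counts.entry(buf[..filled].to_vec()).or_insert(0) += 1;

        if filled < block_size {
            break; // a short final block means the input has ended
        }
    }
    Ok(counts)
}
```

Because the function is generic over Read, it can be called with an opened std::fs::File and a block size of 512 for the benchmark input, or with an in-memory byte slice for a quick check. Wrapping the file in std::io::BufReader would be one way to cut down on system calls, though whether that helps here is an assumption that would need measuring.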