Reading a Matlab Matrix in C++ 2015-02-16

While doing a performance comparison between several linear algebra libraries, I had to read in several large sparse matrices (more than 21 million nonzero values). I’m not going to claim that this is the fastest way to read in a matrix that is stored on disk, but for me it was fast enough.

The Data Structure

This struct contains three std::vectors which store the row, column and value entries from each line in the file. The MATLAB ASCII sparse matrix format does not store the number of rows and columns, so the dimensions have to be inferred from the entries. Some assumptions are made about the matrix, namely that there are no rows with all zero entries and that the last column with data is the last column in the matrix. If your matrix is larger than the entries imply, you will need to manually set the dimensions in the data structure that you store your matrix in.

#include <iostream>
#include <vector>
#include <algorithm>
struct COO {
  std::vector<size_t> row;     // Row entries for matrix
  std::vector<size_t> col;     // Column entries for matrix
  std::vector<double> val;     // Values for the non zero entries
  unsigned int num_rows;       // Number of Rows
  unsigned int num_cols;       // Number of Columns
  unsigned int num_nonzero;    // Number of non zeros
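  // Once the data has been read in, compute the number of rows, columns, and nonzeros.
  // Relies on 1-based indices, entries sorted by row, and at least one entry being present.
  void update() {
    num_rows = row.back();                                 // largest row index is on the last line
    num_cols = *std::max_element(col.begin(), col.end());  // largest column index seen
    num_nonzero = val.size();                              // one value per line in the file
    std::cout << "COO Updated: [Rows, Columns, Non Zeros] [" << num_rows << ", " << num_cols << ", " << num_nonzero << "] " << std::endl;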
  }
};
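
The struct only holds the data. As a minimal sketch of how it might be filled, the function below reads one "row column value" triplet per line from a stream. The names Triplets and read_triplets are hypothetical and not part of the original code; Triplets is a reduced stand-in for COO so that the example compiles on its own. The indices are parsed as double first, on the assumption that the file may store them in floating-point notation.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the COO struct, reduced to its three arrays.
struct Triplets {
  std::vector<std::size_t> row;
  std::vector<std::size_t> col;
  std::vector<double> val;
};

// Reads "row col value" triplets, one per line, until the stream ends.
// Blank or malformed lines are skipped (a choice made for this sketch).
Triplets read_triplets(std::istream& in) {
  Triplets t;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    double r, c, v;
    if (fields >> r >> c >> v) {
      t.row.push_back(static_cast<std::size_t>(r));
      t.col.push_back(static_cast<std::size_t>(c));
      t.val.push_back(v);
    }
  }
  return t;
}
```

Passing a std::ifstream opened on the matrix file to read_triplets would fill the three vectors, after which the dimensions can be computed as in update(). For very large files, reserving space in the vectors up front would avoid repeated reallocation.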

Updating python eggs using pip and easy_install 2015-02-01

I use buildbot to manage our lab's build/testing infrastructure. I wrote up a guide a while back on how to set it up on different platforms. In this post I wanted to document how to keep the setup updated.

Note: If you are using a sandbox, source that sandbox first

source sandbox/bin/activate

Update using pip

Using a shell command

pip freeze --local | grep -v '^\-e' | cut -d = -f 1  | xargs pip install -U

Using a python file

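import sys
from importlib.metadata import distributions  # Python 3.8+
from subprocess import call

# pip.get_installed_distributions() was removed in pip 10, so list packages via importlib.metadata
for dist in distributions():
    # Run pip through the current interpreter so the active sandbox is the one updated
    call([sys.executable, "-m", "pip", "install", "--upgrade", dist.metadata["Name"]])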

Make vs Ninja Performance Comparison 2015-01-31

Ever since I started using CMake to handle generating my build files I have relied on Makefiles. Most Linux distributions come with the make command, so getting up and running doesn’t require too much effort. Make and its derivatives have been around for almost 40 years, and it’s an extremely powerful tool that can do many things beyond simply compiling code. There are cases where the flexibility and power of make are overkill for simply compiling code, and if you are willing to trade them for improved performance, Ninja might be what you are looking for.

Ninja, written by Evan Martin, is a build system that is focused on performance. It was designed for fast incremental builds and for large projects in general. To quote the Chromium project: “Ninja is a build system written with the specific goal of improving the edit-compile cycle time”.

Ninja reports the current progress of the build on a single line, while warnings and errors are output as normal. Make, on the other hand, outputs a line for every single cpp file that is compiled and linked. So beyond the performance improvements, Ninja has a higher signal-to-noise ratio than Make.

OpenCL Kernel Setup 2015-01-06

Most of the content in this post used to be a part of another post. I felt that it was important enough to have its own post so I moved it and made some minor changes to the content.

For a working example, please see the following GitHub repository

Compiling PhysBAM using clang 2015-01-05

The purpose of this post is to document the process I went through to compile the public PhysBAM library using clang. With clang support in place, Xcode should also compile the code properly.

Setup

Repository:  https://github.com/hmazhar/physbam_public

OS:  OS X 10.10.1

Compiler:
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin14.0.0
Thread model: posix

Overview

Most of the compilation errors were not unique and came up quite often. The most prevalent errors were related to the following.

  • Missing functions - Usually defined after use. The fix is to move them before use or create prototypes at the top of the file
  • Two Phase name lookup - This error manifests itself as the following:
error: call to function 'Foo'
      that is neither visible in the template definition nor found by argument-dependent lookup

The problem is caused by the function, in this case “Foo”, being defined in the base class of the templated child class. There are two ways to fix this error. The first is to add a this-> before any call to a function that is defined in the base class. The alternative is to explicitly scope the function with the child class. Both seem to work just fine, but I thought that the first method, using this->, is a little bit cleaner.
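
To make the two fixes concrete, here is a minimal sketch. Base, Child, ViaThis and ViaScope are hypothetical names chosen for illustration and do not come from the PhysBAM sources; only Foo matches the error message above.

```cpp
// Base class template that defines the function being called.
template <typename T>
struct Base {
  T Foo(T x) const { return x + 1; }
};

// Child class template whose base depends on T.
template <typename T>
struct Child : Base<T> {
  // Rejected by clang once instantiated, because Foo lives in a dependent base:
  // T Broken(T x) const { return Foo(x); }

  // Fix 1: this-> defers the lookup of Foo until Child<T> is instantiated.
  T ViaThis(T x) const { return this->Foo(x); }

  // Fix 2: explicitly scope the call with the child class.
  T ViaScope(T x) const { return Child<T>::Foo(x); }
};
```

Both member functions end up calling Base<T>::Foo. The only difference is how the call is spelled, which is why picking between them is mostly a matter of taste.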

Other errors exist but were not common. They will be covered here on a case by case basis.