Brett Klamer

Faster BLAS in R

R uses BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) libraries for linear algebra operations. The default libraries are not fully optimized, so code that depends heavily on linear algebra computations can run much faster after switching to a different BLAS library. The traditional open source options have been ATLAS, GotoBLAS (no longer developed), and more recently OpenBLAS (a fork of GotoBLAS2).

The Intel MKL (Math Kernel Library) offers a closed source alternative to the options above. The Intel MKL is nice, as it’s designed for Intel processors by Intel, but it has its own license terms and may not perform as well on AMD processors.

Normally you would compile R and the BLAS libraries from source for the best optimization. However, the process for doing this changes over time and can result in unexpected behavior. Even using reference BLAS libraries other than R’s default may not be the best choice (see this). With that in mind, the following should provide an easy and conservative approach to replacing the standard BLAS.

Faster BLAS in R for Ubuntu

In Ubuntu, you have two free and open source solutions, ATLAS and OpenBLAS. To install ATLAS and OpenBLAS, use

# install OpenBLAS
sudo apt-get install libopenblas-base
# install ATLAS
sudo apt-get install libatlas3-base liblapack3

Fortunately, if you install both, you can easily switch between them using

sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu

which should show

@ubuntu:~$ sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu
There are 3 choices for the alternative libblas.so.3-x86_64-linux-gnu (providing /usr/lib/x86_64-linux-gnu/libblas.so.3).

  Selection    Path                                                     Priority   Status
------------------------------------------------------------
* 0            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       auto mode
  1            /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3              35        manual mode
  2            /usr/lib/x86_64-linux-gnu/blas/libblas.so.3               10        manual mode
  3            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       manual mode

Press <enter> to keep the current choice[*], or type selection number:

Then make sure to choose the corresponding LAPACK. Run

sudo update-alternatives --config liblapack.so.3-x86_64-linux-gnu

which should show

@ubuntu:~$ sudo update-alternatives --config liblapack.so.3-x86_64-linux-gnu
There are 3 choices for the alternative liblapack.so.3-x86_64-linux-gnu (providing /usr/lib/x86_64-linux-gnu/liblapack.so.3).

  Selection    Path                                                       Priority   Status
------------------------------------------------------------
* 0            /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3   100       auto mode
  1            /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3              35        manual mode
  2            /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3             10        manual mode
  3            /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3   100       manual mode

Press <enter> to keep the current choice[*], or type selection number: 
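
The interactive prompts above can also be skipped with update-alternatives --set, which takes the alternative name and the path to select. The following is a minimal sketch, assuming the provider paths match the listings above. The switch_blas helper and its provider names are hypothetical, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: point BLAS and LAPACK at the same provider.
# Directory names are copied from the update-alternatives listings above.
LIBDIR=/usr/lib/x86_64-linux-gnu

switch_blas() {
    provider=$1      # openblas-pthread, atlas, or reference
    run=${2:-echo}   # default "echo" only prints; pass "sudo" to apply
    case "$provider" in
        openblas-pthread) blas_dir=openblas-pthread; lapack_dir=openblas-pthread ;;
        atlas)            blas_dir=atlas;            lapack_dir=atlas ;;
        reference)        blas_dir=blas;             lapack_dir=lapack ;;
        *) echo "unknown provider: $provider" >&2; return 1 ;;
    esac
    # Keep BLAS and LAPACK on matching providers, as advised above.
    $run update-alternatives --set libblas.so.3-x86_64-linux-gnu \
        "$LIBDIR/$blas_dir/libblas.so.3"
    $run update-alternatives --set liblapack.so.3-x86_64-linux-gnu \
        "$LIBDIR/$lapack_dir/liblapack.so.3"
}

# Dry run: print the two commands for ATLAS without changing anything.
switch_blas atlas
```

Calling switch_blas atlas sudo would run the same two commands with sudo. Restart R afterwards so the new libraries are loaded.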

To check the currently used LAPACK and BLAS within an R session, run

# These functions do not work on Windows
La_library()
extSoftVersion()["BLAS"]

To complicate things, there are actually three different builds of OpenBLAS: serial, OpenMP, and pthread (POSIX threads). The OpenMP and pthread builds allow for parallel processing. It appears that libopenblas-base installs the pthread build by default. Consensus seems to favor OpenMP, though I’m not certain.

For additional OpenBLAS support in Stan (CmdStan and CmdStanR), see https://discourse.mc-stan.org/t/speedup-by-using-external-blas-lapack-with-cmdstan-and-cmdstanr-py/25441

To use Intel’s MKL, you can either

Make sure the environment variable MKL_THREADING_LAYER=GNU is set. Otherwise you will experience numerical errors when using Intel’s MKL.
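
One way to make the variable persistent for R sessions is the user-level Renviron file, which R reads at startup (~/.Renviron, or the file named by R_ENVIRON_USER). This is an assumption about where to set it, since any mechanism that exports the variable before R starts should work. The set_renviron_var helper below is a hypothetical sketch that appends the line only if the variable is not already present.

```shell
#!/bin/sh
# Hypothetical helper: add NAME=VALUE to an Renviron-style file
# unless a line setting NAME already exists there.
set_renviron_var() {
    name=$1
    value=$2
    # Default target: R_ENVIRON_USER if set, otherwise ~/.Renviron.
    file=${3:-${R_ENVIRON_USER:-$HOME/.Renviron}}
    touch "$file"
    if grep -q "^${name}=" "$file"; then
        echo "$name is already set in $file"
    else
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}
```

After sourcing the function, set_renviron_var MKL_THREADING_LAYER GNU adds the line. In a new R session, Sys.getenv("MKL_THREADING_LAYER") should then return "GNU".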

Faster BLAS in R for Windows

It’s a bit tougher for Windows. I don’t know of any current precompiled ATLAS or OpenBLAS files that can be dropped in. There are two sources of older optimized BLAS files, here and here, but it’s probably not something you want to rely on. Even worse, Microsoft R Open no longer exists as of June 2021. The current best methods may be to install Intel’s MKL manually, or copy the Intel MKL libraries from an old MRO installation. MRO installs the MKL files at

C:\Program Files\Microsoft\R Open\R-4.0.2\bin\x64\libiomp5md.dll
C:\Program Files\Microsoft\R Open\R-4.0.2\bin\x64\Rblas.dll
C:\Program Files\Microsoft\R Open\R-4.0.2\bin\x64\Rlapack.dll

and these three files can be dropped in to replace the Rblas.dll and Rlapack.dll files of the open source R at

C:\Program Files\R\R-4.0.2\bin\x64\

There are also a few R packages specific to MRO (see Revo*) that may be transferred over as well, though I haven’t tested this.

But this could be troublesome in a corporate environment, given the license terms:

Microsoft R Services MKL
End User License Agreement

R Services MKL is a software package provided for use with Microsoft R Server, 
Microsoft R Open, and any successors or other software applications released by 
us ("Licensed Products") that can be used with the Intel® Math Kernel Libraries 
("MKL")(for more information, see https://software.intel.com/en-us/intel-mkl).  
This software package is referred to herein as the "R Services MKL."  
Intel Corporation is referred to as "Intel."

1. License.  The R Services MKL is licensed for exclusive use with the Licensed 
Products. Subject to all of the terms and conditions of this Agreement, Microsoft 
grants User a non-exclusive, non-transferable, non-sublicensable right to use R 
Services MKL. R Services MKL must be used as a single integrated software 
application and may not be separated into its constituent parts. 

Benchmarks

Some artificial benchmarks using the following scripts:

This one is also interesting (from here).

Timings below are rough averages across a handful of runs. Nothing exact or rigorous. I’d ignore any differences within 10% or so.
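
For a quick check of your own, a single BLAS-heavy operation can be timed from the shell after each library switch. This is a minimal sketch and not one of the benchmark scripts referred to above. The time_crossprod helper is hypothetical, the matrix size of 2000 is an arbitrary choice, and it assumes Rscript is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: time crossprod() on an n x n random matrix in R.
# crossprod() is delegated to BLAS, so elapsed time should reflect the BLAS in use.
time_crossprod() {
    n=${1:-2000}
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 1
    fi
    # system.time() reports user, system, and elapsed seconds.
    Rscript -e "set.seed(1); n <- ${n}; a <- matrix(rnorm(n * n), n, n); print(system.time(crossprod(a)))"
}

time_crossprod 2000 || echo "benchmark skipped"
```

Run it a few times per BLAS selection and compare the elapsed column. As with the timings here, small differences are noise.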

[Benchmark results missing for: R benchmark 25; Revo Script Matrix Multiply, Cholesky, SVD, PCA, LDA]

Published: 2014-10-23
Last Updated: 2021-09-12