diff --git a/CMakeModules/Version.cmake b/CMakeModules/Version.cmake
index a4ea1b3560..bcc7409fd8 100644
--- a/CMakeModules/Version.cmake
+++ b/CMakeModules/Version.cmake
@@ -10,7 +10,7 @@ ENDIF()
SET(AF_VERSION_MAJOR "3")
SET(AF_VERSION_MINOR "4")
-SET(AF_VERSION_PATCH "0")
+SET(AF_VERSION_PATCH "1")
SET(AF_VERSION "${AF_VERSION_MAJOR}.${AF_VERSION_MINOR}.${AF_VERSION_PATCH}")
SET(AF_API_VERSION_CURRENT ${AF_VERSION_MAJOR}${AF_VERSION_MINOR})
diff --git a/README.md b/README.md
index 07bf01a47c..c0d12b81fd 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,35 @@
-ArrayFire is a general-purpose library that simplifies the process of developing
-software that targets parallel and massively-parallel architectures including
+ArrayFire is a general-purpose library that simplifies the process of developing
+software that targets parallel and massively-parallel architectures including
CPUs, GPUs, and other hardware acceleration devices.
-To achieve this goal, ArrayFire provides software developers with a high-level
-abstraction of data which resides on the accelerator, the `af::array` object
+To achieve this goal, ArrayFire provides software developers with a high-level
+abstraction of data which resides on the accelerator, the `af::array` object
(or C-style struct).
Developers write code which performs operations on ArrayFire arrays which, in turn,
are automatically translated into near-optimal kernels that execute on the computational
-device.
-ArrayFire is successfully used on devices ranging from low-power mobile phones to
-high-power GPU-enabled supercomputers including CPUs from all major vendors (Intel, AMD, Arm),
-GPUs from the dominant manufacturers (NVIDIA, AMD, and Qualcomm), as well as a variety
+device.
+ArrayFire is successfully used on devices ranging from low-power mobile phones to
+high-power GPU-enabled supercomputers including CPUs from all major vendors (Intel, AMD, Arm),
+GPUs from the dominant manufacturers (NVIDIA, AMD, and Qualcomm), as well as a variety
of other accelerator devices on Windows, Mac, and Linux.
Several of ArrayFire's benefits include:
-* [Easy to use](http://arrayfire.org/docs/gettingstarted.htm), stable,
+* [Easy to use](http://arrayfire.org/docs/gettingstarted.htm), stable,
[well-documented](http://arrayfire.org/docs) API.
-* Rigorously Tested for Performance and Accuracy
+* Rigorously Tested for Performance and Accuracy
* Commercially Friendly Open-Source Licensing
* Commercial support from [ArrayFire](http://arrayfire.com)
* [Read about more benefits on Arrayfire.com](http://arrayfire.com/the-arrayfire-library/)
-
+
### Build and Test Status
-| | Linux x86_64 | Linux armv7l | Linux aarch64 | Windows | OSX |
-|:-------:|:------------:|:------------:|:-------------:|:-------:|:---:|
-| Build | [](http://ci.arrayfire.org/job/arrayfire-linux/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrak1/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrax1/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-windows/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-osx/job/build/branch/devel/) |
-| Test | [](http://ci.arrayfire.org/job/arrayfire-linux/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrak1/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrax1/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-windows/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-osx/job/test/branch/devel/) |
+| | Linux x86_64 | Linux aarch64 | Windows | OSX |
+|:-------:|:------------:|:-------------:|:-------:|:---:|
+| Build | [](http://ci.arrayfire.org/job/arrayfire-linux/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrax1/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-windows/job/build/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-osx/job/build/branch/devel/) |
+| Test | [](http://ci.arrayfire.org/job/arrayfire-linux/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-tegrax1/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-windows/job/test/branch/devel/) | [](http://ci.arrayfire.org/job/arrayfire-osx/job/test/branch/devel/) |
### Installation
@@ -143,7 +143,7 @@ details.
### Trademark Policy
-The literal mark “ArrayFire” and ArrayFire logos are trademarks of
+The literal mark “ArrayFire” and ArrayFire logos are trademarks of
AccelerEyes LLC DBA ArrayFire.
If you wish to use either of these marks in your own project, please consult
[ArrayFire's Trademark Policy](http://arrayfire.com/trademark-policy/)
diff --git a/docs/pages/release_notes.md b/docs/pages/release_notes.md
index bdcd91158a..b944a0c9c1 100644
--- a/docs/pages/release_notes.md
+++ b/docs/pages/release_notes.md
@@ -1,6 +1,90 @@
Release Notes {#releasenotes}
==============
+v3.4.1
+==============
+
+Installers
+----------
+* Installers for Linux, OS X and Windows
+ * CUDA backend now uses [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit).
+ * Uses [Intel MKL 2017](https://software.intel.com/en-us/intel-mkl).
+ * CUDA Compute 2.x (Fermi) is no longer compiled into the library.
+* Installer for OS X
+ * The libraries shipping in the OS X Installer are now compiled with Apple
+    Clang v7.3.1 (previously v6.1.0).
+ * The OS X version used is 10.11.6 (previously 10.10.5).
+* Installer for Jetson TX1 / Tegra X1
+ * Requires [JetPack for L4T 2.3](https://developer.nvidia.com/embedded/jetpack)
+ (containing Linux for Tegra r24.2 for TX1).
+ * CUDA backend now uses [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit) 64-bit.
+  * Uses CUDA's cuSOLVER instead of the CPU fallback.
+ * Uses OpenBLAS for CPU BLAS.
+ * All ArrayFire libraries are now 64-bit.
+
+Improvements
+------------
+* Add [sparse array](\ref sparse_func) support to \ref af::eval().
+ [1](https://github.com/arrayfire/arrayfire/pull/1598)
+* Add OpenCL-CPU fallback support for sparse \ref af::matmul() when running on
+ a unified memory device. Uses MKL Sparse BLAS.
+* When using CUDA libdevice, pick the correct compute version based on the device.
+ [1](https://github.com/arrayfire/arrayfire/pull/1612)
+* OpenCL FFT now also supports prime factors 7, 11 and 13.
+ [1](https://github.com/arrayfire/arrayfire/pull/1383)
+ [2](https://github.com/arrayfire/arrayfire/pull/1619)
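The sparse improvements above can be sketched together in one short program. This is an illustrative example only (the matrix size and sparsity threshold are arbitrary), and it assumes an ArrayFire 3.4.1 installation with a backend that provides sparse support:

```cpp
#include <arrayfire.h>

int main() {
    // Build a mostly-zero dense matrix and convert it to sparse storage.
    af::array dense = af::randu(512, 512);
    dense = dense * (dense > 0.95f);   // keep roughly 5% of the entries
    af::array sp = af::sparse(dense);  // AF_STORAGE_CSR by default

    // Sparse arrays can now be passed to af::eval().
    af::eval(sp);

    // Sparse-dense matmul; on a unified memory OpenCL device this can
    // fall back to MKL Sparse BLAS on the CPU.
    af::array x = af::randu(512, 1);
    af::array y = af::matmul(sp, x);
    af_print(y);
    return 0;
}
```

Compile and link against ArrayFire (e.g. `-laf` for the unified backend) to try it.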
+
+Bug Fixes
+---------
+* Allow CUDA libdevice to be detected from a custom directory.
+* Fix `aarch64` detection on Jetson TX1 64-bit OS.
+ [1](https://github.com/arrayfire/arrayfire/pull/1593)
+* Add missing definition of `af_set_fft_plan_cache_size` in unified backend.
+ [1](https://github.com/arrayfire/arrayfire/pull/1591)
+* Fix initial values for \ref af::min() and \ref af::max() operations.
+ [1](https://github.com/arrayfire/arrayfire/pull/1594)
+ [2](https://github.com/arrayfire/arrayfire/pull/1595)
+* Fix distance calculation in \ref af::nearestNeighbour for the CUDA and OpenCL backends.
+ [1](https://github.com/arrayfire/arrayfire/pull/1596)
+ [2](https://github.com/arrayfire/arrayfire/pull/1595)
+* Fix OpenCL bug where scalars were passed incorrectly to compile options.
+ [1](https://github.com/arrayfire/arrayfire/pull/1595)
+* Fix bug in \ref af::Window::surface() with respect to dimensions and ranges.
+ [1](https://github.com/arrayfire/arrayfire/pull/1604)
+* Fix possible double free corruption in \ref af_assign_seq().
+ [1](https://github.com/arrayfire/arrayfire/pull/1605)
+* Add missing eval for the key array in \ref af::scanByKey in the CPU backend.
+ [1](https://github.com/arrayfire/arrayfire/pull/1605)
+* Fix creation of the sparse values array when using \ref AF_STORAGE_COO.
+  [1](https://github.com/arrayfire/arrayfire/pull/1620)
+  [2](https://github.com/arrayfire/arrayfire/pull/1621)
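The COO creation path fixed above can be exercised as follows. The triplet values here are illustrative, and the sketch assumes an ArrayFire 3.4.1 installation:

```cpp
#include <arrayfire.h>

int main() {
    // A 3x3 matrix with three non-zeros, specified as COO triplets.
    float vals[] = {5.0f, 8.0f, 3.0f};
    int   rows[] = {0, 1, 2};
    int   cols[] = {0, 2, 1};

    af::array values(3, vals);
    af::array rowIdx(3, rows);
    af::array colIdx(3, cols);

    // Create the sparse array directly in COO storage.
    af::array sp = af::sparse(3, 3, values, rowIdx, colIdx, AF_STORAGE_COO);

    // Convert back to dense to inspect the result.
    af_print(af::dense(sp));
    return 0;
}
```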
+
+Examples
+--------
+* Add a [Conjugate Gradient solver example](\ref benchmarks/cg.cpp)
+ to demonstrate sparse and dense matrix operations.
+ [1](https://github.com/arrayfire/arrayfire/pull/1599)
+
+CUDA Backend
+------------
+* When using [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit),
+  computes 2.x are no longer in the default compute list.
+ * This follows [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit)
+ deprecating computes 2.x.
+ * Default computes for CUDA 8.0 will be 30, 50, 60.
+* When using CUDA pre-8.0, the default selection remains 20, 30, 50.
+* CUDA backend now uses `-arch=sm_30` by default for PTX compilation,
+  unless compute 2.0 is enabled.
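For reference, the default CUDA 8.0 compute list (30, 50, 60) corresponds to nvcc flags like the following. This is shown only to illustrate what the defaults mean; ArrayFire's build generates the actual flags through CMake, and the file name here is a placeholder:

```shell
nvcc -c kernel.cu \
  -gencode arch=compute_30,code=sm_30 \
  -gencode arch=compute_50,code=sm_50 \
  -gencode arch=compute_60,code=sm_60 \
  -gencode arch=compute_30,code=compute_30  # embed sm_30 PTX for forward compatibility
```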
+
+Known Issues
+------------
+* \ref af::lu() on CPU is known to give incorrect results when built and run
+  on OS X 10.11 or 10.12 with the Accelerate Framework.
+ [1](https://github.com/arrayfire/arrayfire/pull/1617)
+  * Since the OS X Installer libraries use MKL rather than the Accelerate
+    Framework, this issue does not affect those libraries.
+
+
v3.4.0
==============