Commit 14668a45 authored by DannyRuijters's avatar DannyRuijters
Browse files

Initial commit

Initial commit of the pre-filtered cubic b-spline interpolation CUDA
code
parent daa00b1f
Copyright (c) 2015, DannyRuijters
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# CubicInterpolationCUDA
GPU accelerated pre-filtered cubic b-spline interpolation using CUDA
@echo This example video needs the divx codec to be installed!!!
@cudaCubicAviPlayback.exe ../../../examples/data/jynx_divx.avi
@pause
@cudaCubicRayCast.exe ../../../examples/data/bucky.raw 32 32 32
@pause
@cudaCubicTexture3D.exe ../../../examples/data/bucky.raw 32 32 32
@pause
@glCubicRayCast.exe ../../../examples/data/bucky.raw 32 32 32
@pause
@referenceCubicTexture3D.exe ../../../examples/data/bucky.raw 32 32 32
@pause
2013-05-22 | version 1.2 | Author: Danny Ruijters, Luca Caucci
_______________________________________________
* Added support for layered textures
* Added makefiles for CUDA5 in the samples
2012-10-31 | version 1.1 | Author: Danny Ruijters
_______________________________________________
* Added 1D cubic interpolation routines
* Several small fixes
* Added Accuracy test sample for prefiltered 1D, 2D and 3D textures
2010-10-05 | version 1.0 | Author: Danny Ruijters
_______________________________________________
* The prefilter is now initialized for clamping (instead of mirroring) boundary conditions, which is consistent with CUDA texture lookups.
* Changed the prefilter and memcpu.cu functions to use pitched pointers in order to cope with non-power-of-two data.
* All example programs now use this pitched pointer.
2010-05-24 | version 0.9 | Author: Tamas Farago
_______________________________________________
* Small bug fix in the "SamplesToCoefficients3DZ" function.
2010-04-29 | version 0.8 | Author: Danny Ruijters
_________________________________________________
* Added 1st order b-spline derivatives and interpolation functions, called "cubicTex3D_1st_derivative_x", etc.
* Renamed the cubic interpolation include files to "cubicTex3D.cu" and "cubicTex2D.cu".
* Reorganized the "code" folder.* Added copy volume to texture function "CreateTextureFromVolume" to memcpy.cu.
* Fixed some minor bugs in the cubic ray casting examples.
2010-04-08 | version 0.7 | Author: Danny Ruijters
_________________________________________________
* Added support for vector data formats (such as RGBA).
* Added avi sample to illustrate the use of the vector formats.
* Renamed the cubic texture lookup functions to "cubicTex2D" and "cubicTex3D".
* Made the names of the examples more consistant.
* Added Makefiles to the cudaCubicRotate2D, cudaCubicTexture3D, cudaCubicRayCast, and glCubicRayCast examples for compilation on the mac and linux.
* Added readme.txt in examples to explain how to compile with CUDA SDK 3.0
2009-12-28 | version 0.6 | Author: Danny Ruijters
_________________________________________________
* Added OpenGL ray-casting example "glCubicRayCast" using tri-cubic interpolation, without using CUDA.
2009-11-24 | version 0.5 | Author: Danny Ruijters
_________________________________________________
* Small fix in template function in memcpy.cu to make it also compile with CUDA 2.3.
* Changed all constants from double to float notation.
2009-02-20 | version 0.4 | Author: Danny Ruijters
_________________________________________________
* Small fix in template call in memcpy.cu to make it also compile with CUDA 2.1.
2009-01-25 | version 0.3 | Author: Danny Ruijters
_________________________________________________
* Small bugfix in the SSE code in examples/referenceCubicTexture3D, provided by Alfred R. Fuller.
* Updated the internet address, since the webpage has moved.
* Corrected the name of the CastVolumeHostToDevice function in memcpy.cu
2008-10-12 | version 0.2 | Author: Danny Ruijters
_________________________________________________
* Added SSE and multi-threading support to the reference application, in order to obtain fair performance comparisons.
* Fixed the internet address in the text at the top in the source files.
* Added macros for enabling / disabling display output in the simpleCubicTexture3D and referenceCubicTexture3D examples, in order to profile the interpolation performance.
2008-10-06 | version 0.1 | Author: Danny Ruijters
_________________________________________________
* Initial version
/*--------------------------------------------------------------------------*\
Copyright (c) 2008-2010, Danny Ruijters. All rights reserved.
http://www.dannyruijters.nl/cubicinterpolation/
This file is part of CUDA Cubic B-Spline Interpolation (CI).
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holders nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
The views and conclusions contained in the software and documentation are
those of the authors and should not be interpreted as representing official
policies, either expressed or implied.
When using this code in a scientific project, please cite one or all of the
following papers:
* Daniel Ruijters and Philippe Thévenaz,
GPU Prefilter for Accurate Cubic B-Spline Interpolation,
The Computer Journal, vol. 55, no. 1, pp. 15-20, January 2012.
http://dannyruijters.nl/docs/cudaPrefilter3.pdf
* Daniel Ruijters, Bart M. ter Haar Romeny, and Paul Suetens,
Efficient GPU-Based Texture Interpolation using Uniform B-Splines,
Journal of Graphics Tools, vol. 13, no. 4, pp. 61-69, 2008.
\*--------------------------------------------------------------------------*/
#ifndef _CAST_FLOAT4_H_
#define _CAST_FLOAT4_H_
#include "memcpy.cu"
//--------------------------------------------------------------------------
// Declare the interleaved copu CUDA kernel
//--------------------------------------------------------------------------
template<class T> __global__ void CopyCastInterleaved(uchar* destination, const T* source, uint pitch, uint width)
{
uint2 index = make_uint2(
__umul24(blockIdx.x, blockDim.x) + threadIdx.x,
__umul24(blockIdx.y, blockDim.y) + threadIdx.y);
uint index3 = 3 * (index.y * width + index.x);
float4* dest = (float4*)(destination + index.y * pitch) + index.x;
float mult = 1.0f / Multiplier<T>();
*dest = make_float4(
mult * (float)source[index3],
mult * (float)source[index3+1],
mult * (float)source[index3+2], 1.0f);
}
//--------------------------------------------------------------------------
// Declare the typecast templated function
// This function can be called directly in C++ programs
//--------------------------------------------------------------------------
//! Allocate GPU memory and copy a voxel volume from CPU to GPU memory
//! and cast it to the normalized floating point format
//! @return the pointer to the GPU copy of the voxel volume
//! @param host pointer to the voxel volume in CPU (host) memory
//! @param width volume width in number of voxels
//! @param height volume height in number of voxels
//! @param depth volume depth in number of voxels
template<class T> extern cudaPitchedPtr CastVolumeHost3ToDevice4(const T* host, uint width, uint height, uint depth)
{
cudaPitchedPtr device = {0};
const cudaExtent extent = make_cudaExtent(width * sizeof(float4), height, depth);
CUDA_SAFE_CALL(cudaMalloc3D(&device, extent));
const size_t pitchedBytesPerSlice = device.pitch * device.ysize;
T* temp = 0;
const uint voxelsPerSlice = width * height;
const size_t nrOfBytesTemp = voxelsPerSlice * 3 * sizeof(T);
CUDA_SAFE_CALL(cudaMalloc((void**)&temp, nrOfBytesTemp));
uint dimX = min(PowTwoDivider(width), 64);
dim3 dimBlock(dimX, min(PowTwoDivider(height), 512 / dimX));
dim3 dimGrid(width / dimBlock.x, height / dimBlock.y);
size_t offsetHost = 0;
size_t offsetDevice = 0;
for (uint slice = 0; slice < depth; slice++)
{
CUDA_SAFE_CALL(cudaMemcpy(temp, host + offsetHost, nrOfBytesTemp, cudaMemcpyHostToDevice));
CopyCastInterleaved<T><<<dimGrid, dimBlock>>>((uchar*)device.ptr + offsetDevice, temp, (uint)device.pitch, width);
CUT_CHECK_ERROR("Cast kernel failed");
offsetHost += voxelsPerSlice;
offsetDevice += pitchedBytesPerSlice;
}
CUDA_SAFE_CALL(cudaFree(temp)); //free the temp GPU volume
return device;
}
#endif //_CAST_FLOAT4_H_
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment