CUDA external class linkage and unresolved extern function in ptxas file
Problem description
I'm working with CUDA and have created an int2_ class to deal with complex integer numbers.
The class is declared in the ComplexTypes.h file as follows:
namespace LibraryNameSpace
{
    class int2_ {
    public:
        int x;
        int y;
        // Constructors
        __host__ __device__ int2_(const int,const int);
        __host__ __device__ int2_();
        // etc.
        // Equalities with other types
        __host__ __device__ const int2_& operator=(const int);
        __host__ __device__ const int2_& operator=(const float);
        // etc.
    };
}
The class is implemented in the ComplexTypes.cpp file as follows:
#include "ComplexTypes.h"
__host__ __device__ LibraryNameSpace::int2_::int2_(const int x_,const int y_) { x=x_; y=y_;}
__host__ __device__ LibraryNameSpace::int2_::int2_() {}
// etc.
__host__ __device__ const LibraryNameSpace::int2_& LibraryNameSpace::int2_::operator=(const int a) { x = a; y = 0.; return *this; }
__host__ __device__ const LibraryNameSpace::int2_& LibraryNameSpace::int2_::operator=(const float a) { x = (int)a; y = 0.; return *this; }
// etc.
Everything works well. In the main function (whose file includes ComplexTypes.h) I can work with int2_ numbers.
In the CudaMatrix.cu file, I'm now including ComplexTypes.h and defining and explicitly instantiating the __global__ function:
template <class T1, class T2>
__global__ void evaluation_matrix(T1* data_, T2* ob, int NumElements)
{
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if(i < NumElements) data_[i] = ob[i];
}
template __global__ void evaluation_matrix(LibraryNameSpace::int2_*,int*,int);
The situation of the CudaMatrix.cu file seems to be symmetric to that of the main function. Nevertheless, the compiler complains:
Error 19 error : Unresolved extern function '_ZN16LibraryNameSpace5int2_aSEi' C:UsersDocumentsProjectTestTesting_Filesptxas simpleTest
Please consider that:
- Before moving the implementation to separate files, everything was working correctly when both declarations and implementations were included in the main file.
- The problematic instruction is data_[i] = ob[i].
Does anyone have an idea of what is going on?
The procedure I followed in my post above has two issues:
- The ComplexTypes.cpp filename must be changed to ComplexTypes.cu so that nvcc can process the CUDA keywords __device__ and __host__. This was pointed out by Talonmies in his comment. Actually, before posting, I had already tried changing the filename from .cpp to .cu, but the compiler kept showing the same error, so I naively reverted the change.
- In Visual Studio 2010, one has to use View -> Property Pages; Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true). This is necessary for separate compilation. Indeed, the NVIDIA CUDA Compiler Driver NVCC documentation says:
CUDA works by embedding device code into host objects. In whole program compilation, it embeds executable device code into the host object. In separate compilation, we embed relocatable device code into the host object, and run the device linker (nvlink) to link all the device code together. The output of nvlink is then linked together with all the host objects by the host linker to form the final executable. The generation of relocatable vs executable device code is controlled by the --relocatable-device-code={true,false} option, which can be shortened to -rdc={true,false}.
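As a side note, the question mentions that everything worked while declarations and implementations lived in the same file. That suggests an alternative that does not need relocatable device code at all: define the member functions inline in the header, so every translation unit that includes it carries its own copy of the device code. The following is a minimal sketch of that idea, not the original code. The COMPLEXTYPES_HD macro and the evaluation_matrix_host helper are hypothetical names introduced here for illustration, and the macro assumes the header may also be included from files built by a plain host compiler.

```cpp
// ComplexTypes.h (sketch): members are defined inside the class, so they are
// implicitly inline and available in every file that includes this header.

// COMPLEXTYPES_HD is a hypothetical helper macro (not part of CUDA).
// nvcc defines __CUDACC__, so the qualifiers are only used when nvcc compiles.
#ifdef __CUDACC__
#define COMPLEXTYPES_HD __host__ __device__
#else
#define COMPLEXTYPES_HD
#endif

namespace LibraryNameSpace
{
    class int2_ {
    public:
        int x;
        int y;

        // Constructors
        COMPLEXTYPES_HD int2_(const int x_, const int y_) : x(x_), y(y_) {}
        COMPLEXTYPES_HD int2_() {}

        // Assignment from real types: the imaginary part is set to zero
        COMPLEXTYPES_HD const int2_& operator=(const int a)   { x = a;      y = 0; return *this; }
        COMPLEXTYPES_HD const int2_& operator=(const float a) { x = (int)a; y = 0; return *this; }
    };
}

#ifdef __CUDACC__
// Same kernel as in the question; only visible when compiled by nvcc.
template <class T1, class T2>
__global__ void evaluation_matrix(T1* data_, T2* ob, int NumElements)
{
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < NumElements) data_[i] = ob[i];
}
#endif

// Hypothetical host-side counterpart of the kernel body, useful for checking
// the assignment operators without a GPU.
template <class T1, class T2>
void evaluation_matrix_host(T1* data_, const T2* ob, int NumElements)
{
    for (int i = 0; i < NumElements; ++i) data_[i] = ob[i];
}
```

With this layout the assignment data_[i] = ob[i] resolves to code that is present in the same translation unit as the kernel, so ptxas has nothing left to resolve. If the two-file layout from the answer is kept instead, the command-line counterpart of the Visual Studio setting is to pass -rdc=true to nvcc when compiling the .cu files and to let nvcc perform the final link.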