2D char array to CUDA kernel
Problem description
I need help transferring a char[][] to a CUDA kernel. This is my code:
__global__
void kernel(char** BiExponent){
for(int i=0; i<500; i++)
printf("%c",BiExponent[1][i]); // I want print line 1
}
int main(){
char (*Bi2dChar)[500] = new char [5000][500];
char **dev_Bi2dChar;
...//HERE I INPUT DATA TO Bi2dChar
size_t host_orig_pitch = 500 * sizeof(char);
size_t pitch;
cudaMallocPitch((void**)&dev_Bi2dChar, &pitch, 500 * sizeof(char), 5000);
cudaMemcpy2D(dev_Bi2dChar, pitch, Bi2dChar, host_orig_pitch, 500 * sizeof(char), 5000, cudaMemcpyHostToDevice);
kernel <<< 1, 512 >>> (dev_Bi2dChar);
free(Bi2dChar); cudaFree(dev_Bi2dChar);
}
I use: nvcc.exe" -gencode=arch=compute_20,code="sm_20,compute_20" --use-local-env --cl-version 2012 -ccbin
Thanks for your help.
Recommended answer
cudaMemcpy2D doesn't actually handle 2-dimensional (i.e. double pointer, **) arrays in C. Note that the documentation indicates it expects single pointers, not double pointers.
Generally speaking, moving arbitrary double pointer C arrays between the host and the device is more complicated than a single pointer array.
If you really want to handle a double-pointer array, search for "CUDA 2D Array" in the search box at the upper right-hand corner of this page, and you'll find various examples of how to do it (for example, the answer given by @talonmies).
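To illustrate why the double-pointer route is more involved, here is a minimal sketch. It is not the code from the answer referenced above. The helper name double_pointer_roundtrip, the kernel copy_row_pp, and the ROWS/COLS macros are made up for this illustration. The key point is that the table of row pointers must hold device addresses and must itself live in device memory:

```cuda
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

#define ROWS 5000   // number of lines, as in the question
#define COLS 500    // characters per line

// Reads one row through a genuine char** and writes it to `out`.
__global__ void copy_row_pp(char** rows, int row, char* out) {
    for (int i = threadIdx.x; i < COLS; i += blockDim.x)
        out[i] = rows[row][i];
}

// Hypothetical helper: returns true if row 1, read on the device through
// the pointer table, matches row 1 of the host data.
bool double_pointer_roundtrip() {
    std::vector<char> host(ROWS * COLS);
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            host[r * COLS + c] = (char)('a' + (r + c) % 26);

    char*  dev_data = nullptr;  // storage for all characters
    char** dev_rows = nullptr;  // table of row pointers, also in device memory
    char*  dev_out  = nullptr;  // receives the row read by the kernel
    bool ok = cudaMalloc((void**)&dev_data, ROWS * COLS) == cudaSuccess
           && cudaMalloc((void**)&dev_rows, ROWS * sizeof(char*)) == cudaSuccess
           && cudaMalloc((void**)&dev_out, COLS) == cudaSuccess;

    std::vector<char> row(COLS, 0);
    if (ok) {
        // Build the table on the host, but fill it with DEVICE addresses.
        std::vector<char*> table(ROWS);
        for (int r = 0; r < ROWS; ++r)
            table[r] = dev_data + (size_t)r * COLS;
        ok = cudaMemcpy(dev_data, host.data(), host.size(),
                        cudaMemcpyHostToDevice) == cudaSuccess
          && cudaMemcpy(dev_rows, table.data(), ROWS * sizeof(char*),
                        cudaMemcpyHostToDevice) == cudaSuccess;
    }
    if (ok) {
        copy_row_pp<<<1, 256>>>(dev_rows, 1, dev_out);
        // This blocking copy also waits for the kernel to finish.
        ok = cudaMemcpy(row.data(), dev_out, COLS,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev_out);
    cudaFree(dev_rows);
    cudaFree(dev_data);
    return ok && std::memcmp(row.data(), &host[1 * COLS], COLS) == 0;
}
```

This sketch points every table entry into one large allocation to keep it short. Allocating each row with its own cudaMalloc works the same way, at the cost of one allocation and one copy per row. To try it, call double_pointer_roundtrip() from main.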
Often, an easier approach is simply to "flatten" the array so it can be referenced by a single pointer, i.e. char[] instead of char[][], and then use index arithmetic to simulate 2-dimensional access.
Your flattened code would look something like this (the code you provided is an uncompilable, incomplete snippet, so mine is too):
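#include <cstdio>
#define XDIM 5000   // number of rows
#define YDIM 500    // characters per row
__global__
void kernel(char* BiExponent){
    // Row r starts at offset r*YDIM in the flattened array.
    // Note: every launched thread runs this loop and prints the row.
    for(int i=0; i<YDIM; i++)
        printf("%c",BiExponent[(1*YDIM)+i]); // I want print line 1
}
int main(){
    char (*Bi2dChar)[YDIM] = new char [XDIM][YDIM];
    char *dev_Bi2dChar;
    ...//HERE I INPUT DATA TO Bi2dChar
    cudaMalloc((void**)&dev_Bi2dChar,XDIM*YDIM * sizeof(char));
    // The host rows are contiguous, so one linear copy is enough
    cudaMemcpy(dev_Bi2dChar, &(Bi2dChar[0][0]), XDIM*YDIM * sizeof(char), cudaMemcpyHostToDevice);
    kernel <<< 1, 512 >>> (dev_Bi2dChar);
    cudaDeviceSynchronize(); // wait for the kernel so its printf output is flushed
    delete[] Bi2dChar; cudaFree(dev_Bi2dChar); // new[] pairs with delete[], not free()
}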
If you want a pitched array, you can create it in a similar way, but it is still a single-pointer array, not a double-pointer array.
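A minimal sketch of the pitched variant follows, assuming the same 5000 x 500 layout as the question. The helper name pitched_row_roundtrip, the kernel copy_row, and the ROWS/COLS macros are invented for this example. cudaMallocPitch and cudaMemcpy2D are used with a plain char*, and the kernel finds a row by stepping `pitch` bytes per row:

```cuda
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

#define ROWS 5000   // number of lines, as in the question
#define COLS 500    // characters per line

// Copies one row of a pitched allocation into `out`.
// Consecutive rows are `pitch` bytes apart, which may be more than COLS.
__global__ void copy_row(const char* data, size_t pitch, int row, char* out) {
    const char* row_ptr = data + row * pitch;
    for (int i = threadIdx.x; i < COLS; i += blockDim.x)
        out[i] = row_ptr[i];
}

// Hypothetical helper: returns true if row 1, read from the pitched device
// array, matches row 1 of the host data.
bool pitched_row_roundtrip() {
    std::vector<char> host(ROWS * COLS);
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            host[r * COLS + c] = (char)('a' + (r + c) % 26);

    char*  dev     = nullptr;  // single pointer to the pitched allocation
    char*  dev_out = nullptr;  // receives the row read by the kernel
    size_t pitch   = 0;        // row stride in bytes, chosen by the runtime
    bool ok = cudaMallocPitch((void**)&dev, &pitch, COLS * sizeof(char),
                              ROWS) == cudaSuccess
           && cudaMalloc((void**)&dev_out, COLS) == cudaSuccess;

    std::vector<char> row(COLS, 0);
    if (ok) {
        // Host rows are tightly packed (stride COLS); device rows use `pitch`.
        ok = cudaMemcpy2D(dev, pitch, host.data(), COLS * sizeof(char),
                          COLS * sizeof(char), ROWS,
                          cudaMemcpyHostToDevice) == cudaSuccess;
    }
    if (ok) {
        copy_row<<<1, 256>>>(dev, pitch, 1, dev_out);
        // This blocking copy also waits for the kernel to finish.
        ok = cudaMemcpy(row.data(), dev_out, COLS,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev_out);
    cudaFree(dev);
    return ok && std::memcmp(row.data(), &host[1 * COLS], COLS) == 0;
}
```

The pitch has to be passed to the kernel because the runtime may pad each row for alignment, so indexing with row * COLS would only be correct when the pitch happens to equal COLS. To try it, call pitched_row_roundtrip() from main.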