Cuda Hello World printf not working even with -arch=sm_20
Question
I didn't think I was a complete newbie with CUDA, but apparently I am.
I recently upgraded my CUDA device from compute capability 1.3 to 2.1 (GeForce GT 630). I decided to do a full upgrade to CUDA toolkit 5.0 as well.
I can compile general CUDA kernels, but printf is not working even with -arch=sm_20 set.
Code:
#include <stdio.h>
#include <assert.h>
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void test(){
    printf("Hi Cuda World");
}

int main( int argc, char** argv )
{
    test<<<1,1>>>();
    return 0;
}
Compiler output:
Error 2 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -gencode=arch=compute_10,code="sm_20,compute_10" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -arch=sm_20 -g -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o "Debug\main.cu.obj" "d:\userstore\documents\visual studio 2010\Projects\testCuda\testCuda\main.cu"" exited with code 2. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 5.0.targets 592 10 testCuda
Error 1 error : calling a __host__ function("printf") from a __global__ function("test") is not allowed d:\userstore\documents\visual studio 2010\Projects\testCuda\testCuda\main.cu 9 1 testCuda
I'm about done with life because of this problem...done done done. Please talk me down from the rooftops with an answer.
Recommended answer
In-kernel printf is only supported on hardware of compute capability 2.0 or higher. Because your project is set to build for both compute capability 1.0 and compute capability 2.1, nvcc compiles the code multiple times and builds a multi-architecture fatbinary object. The error is generated during the compute capability 1.0 compilation pass, because the printf call is unsupported for that architecture.
If you remove the compute capability 1.0 build target from your project, the error will disappear.
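Before changing build targets, it can help to confirm what compute capability the installed device actually reports. The following is a minimal host-side sketch using the runtime API call cudaGetDeviceProperties; it assumes the GPU of interest is device 0, which is not stated in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int device = 0;  // assumption: the GPU of interest is device 0
    cudaDeviceProp prop;

    // Query the properties of the chosen device and check for failure.
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // prop.major and prop.minor together give the compute capability.
    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);

    // In-kernel printf needs compute capability 2.0 or higher.
    if (prop.major < 2) {
        printf("In-kernel printf is not supported on this device.\n");
    }
    return 0;
}
```

If the reported major version is 2 or higher, the device itself is not the obstacle and the remaining work is in the project's code generation settings.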
You can also write the kernel like this:
__global__ void test()
{
#if __CUDA_ARCH__ >= 200
    printf("Hi Cuda World");
#endif
}
The __CUDA_ARCH__ symbol is only >= 200 when building for compute capability 2.0 or higher targets, so this guard lets you compile the code for compute capability 1.x devices without hitting the compile error.
If you compile for the correct architecture and still get no output, you also need to make sure that the kernel finishes and the driver flushes the output buffer. To do this, add a synchronizing call after the kernel launch in the host code, for example:
int main( int argc, char** argv )
{
    test<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}
[disclaimer: all code written in browser, never compiled, use at own risk]
If you do both things, you should be able to compile, run and see output.
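Putting the two suggestions together, a complete program might look like the sketch below. The error checks with cudaGetLastError and cudaGetErrorString, the trailing newline in the format string, and the build command in the header comment are additions for illustration and are not part of the original answer.

```cuda
// Possible build command (an assumption, not the asker's project settings):
//   nvcc -arch=sm_20 -o hello main.cu
// Toolkits from CUDA 9 onward no longer accept sm_20; substitute an
// architecture that your toolkit and device support.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void test()
{
    // Only emit the printf call when compiling device code for
    // compute capability 2.0 or higher.
#if __CUDA_ARCH__ >= 200
    printf("Hi Cuda World\n");
#endif
}

int main()
{
    test<<<1, 1>>>();

    // Catch launch failures, such as no kernel image for this device.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Wait for the kernel to finish so the device-side printf buffer
    // is flushed to the host before the program exits.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Checking the return codes matters here because a missing or mismatched architecture can make the launch fail silently, which looks the same as printf producing no output.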