What is more efficient? Using pow to square or just multiplying it with itself?


Question

Which of these two methods is more efficient in C? And how about:

pow(x,3)


vs.
x*x*x // etc?


Recommended answer
UPDATE 2021
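I've modified the benchmark code as follows:
std::chrono is used for timing measurements instead of Boost
C++11 <random> is used instead of rand()
Repeated operations that could get hoisted out of the loop are avoided; the base parameter changes on every iteration
I get the following results with GCC 10 -O2 (in seconds):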
exp     c++ pow     c pow       x*x*x...
2       0.204243    1.39962     0.0902527   
3       1.36162     1.38291     0.107679    
4       1.37717     1.38197     0.106103    
5       1.3815      1.39139     0.117097

GCC 10 -O3 is almost identical to GCC 10 -O2.
With GCC 10 -O2 -ffast-math:
exp     c++ pow     c pow       x*x*x...
2       0.203625    1.4056      0.0913414   
3       0.11094     1.39938     0.108027    
4       0.201593    1.38618     0.101585    
5       0.102141    1.38212     0.10662

With GCC 10 -O3 -ffast-math:
exp     c++ pow     c pow       x*x*x...
2       0.0451995   1.175       0.0450497   
3       0.0470842   1.20226     0.051399    
4       0.0475239   1.18033     0.0473844   
5       0.0522424   1.16817     0.0522291

With Clang 12 -O2:
exp     c++ pow     c pow       x*x*x...
2       0.106242    0.105435    0.105533    
3       1.45909     1.4425      0.102235    
4       1.45629     1.44262     0.108861    
5       1.45837     1.44483     0.1116

Clang 12 -O3 is almost identical to Clang 12 -O2.
With Clang 12 -O2 -ffast-math:
exp     c++ pow     c pow       x*x*x...
2       0.0233731   0.0232457   0.0231076   
3       0.0271074   0.0266663   0.0278415   
4       0.026897    0.0270698   0.0268115   
5       0.0312481   0.0296402   0.029811    

Clang 12 -O3 -ffast-math is almost identical to Clang 12 -O2 -ffast-math.
Machine is Intel Core i7-7700K on Linux 5.4.0-73-generic x86_64.
Conclusions:

With GCC 10 (no -ffast-math), x*x*x... is always faster

With GCC 10 -O2 -ffast-math, std::pow is as fast as x*x*x... for odd exponents

With GCC 10 -O3 -ffast-math, std::pow is as fast as x*x*x... for all test cases, and is around twice as fast as -O2.

With GCC 10, C's pow(double, double) is always much slower

With Clang 12 (no -ffast-math), x*x*x... is faster for exponents greater than 2

With Clang 12 -ffast-math, all methods produce similar results

With Clang 12, pow(double, double) is as fast as std::pow for integral exponents

Writing benchmarks without having the compiler outsmart you is hard.

I'll eventually get around to installing a more recent version of GCC on my machine and will update my results when I do so.

Here's the updated benchmark code:

#include <cmath>
#include <chrono>
#include <iostream>
#include <random>

using Moment = std::chrono::high_resolution_clock::time_point;
using FloatSecs = std::chrono::duration<double>;

inline Moment now()
{
    return std::chrono::high_resolution_clock::now();
}

// Defines test<num>(b, loops): times `expression` in a loop and sums the results.
// b changes on every iteration so the expression cannot be hoisted out of the loop.
#define TEST(num, expression) \
double test##num(double b, long loops) \
{ \
    double x = 0.0; \
\
    auto startTime = now(); \
    for (long i=0; i<loops; ++i) \
    { \
        x += expression; \
        b += 1.0; \
    } \
    auto elapsed = now() - startTime; \
    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed); \
    std::cout << seconds.count() << "\t"; \
    return x; \
}

TEST(2, b*b)
TEST(3, b*b*b)
TEST(4, b*b*b*b)
TEST(5, b*b*b*b*b)

// std::pow with an integer exponent known at compile time.
template <int exponent>
double testCppPow(double base, long loops)
{
    double x = 0.0;

    auto startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += std::pow(base, exponent);
        base += 1.0;
    }
    auto elapsed = now() - startTime;

    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed);
    std::cout << seconds.count() << "\t";

    return x;
}

// C's pow(double, double) with the exponent passed at run time.
double testCPow(double base, double exponent, long loops)
{
    double x = 0.0;

    auto startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += ::pow(base, exponent);
        base += 1.0;
    }
    auto elapsed = now() - startTime;

    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed);
    std::cout << seconds.count() << "\t";

    return x;
}

int main()
{
    using std::cout;
    long loops = 100000000l;
    double x = 0;
    std::random_device rd;
    std::default_random_engine re(rd());
    std::uniform_real_distribution<double> dist(1.1, 1.2);
    cout << "exp\tc++ pow\tc pow\tx*x*x...";

    cout << "\n2\t";
    double b = dist(re);
    x += testCppPow<2>(b, loops);
    x += testCPow(b, 2.0, loops);
    x += test2(b, loops);

    cout << "\n3\t";
    b = dist(re);
    x += testCppPow<3>(b, loops);
    x += testCPow(b, 3.0, loops);
    x += test3(b, loops);

    cout << "\n4\t";
    b = dist(re);
    x += testCppPow<4>(b, loops);
    x += testCPow(b, 4.0, loops);
    x += test4(b, loops);

    cout << "\n5\t";
    b = dist(re);
    x += testCppPow<5>(b, loops);
    x += testCPow(b, 5.0, loops);
    x += test5(b, loops);

    // Print the accumulated value so the compiler cannot discard the loops.
    std::cout << "\n" << x << "\n";
}

Old answer, 2010

I tested the performance difference between x*x*... and pow(x,i) for small i using this code:

#include <cstdlib>
#include <cmath>
#include <iostream>
#include <boost/date_time/posix_time/posix_time.hpp>

inline boost::posix_time::ptime now()
{
    return boost::posix_time::microsec_clock::local_time();
}

// Defines test<num>(b, loops): times `expression` ten times per iteration and sums the results.
#define TEST(num, expression) \
double test##num(double b, long loops) \
{ \
    double x = 0.0; \
\
    boost::posix_time::ptime startTime = now(); \
    for (long i=0; i<loops; ++i) \
    { \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
    } \
    boost::posix_time::time_duration elapsed = now() - startTime; \
\
    std::cout << elapsed << " "; \
\
    return x; \
}

TEST(1, b)
TEST(2, b*b)
TEST(3, b*b*b)
TEST(4, b*b*b*b)
TEST(5, b*b*b*b*b)

// std::pow with an integer exponent known at compile time.
template <int exponent>
double testpow(double base, long loops)
{
    double x = 0.0;

    boost::posix_time::ptime startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
    }
    boost::posix_time::time_duration elapsed = now() - startTime;

    std::cout << elapsed << " ";

    return x;
}

int main()
{
    using std::cout;
    long loops = 100000000l;
    double x = 0.0;
    cout << "1 ";
    x += testpow<1>(rand(), loops);
    x += test1(rand(), loops);

    cout << "\n2 ";
    x += testpow<2>(rand(), loops);
    x += test2(rand(), loops);

    cout << "\n3 ";
    x += testpow<3>(rand(), loops);
    x += test3(rand(), loops);

    cout << "\n4 ";
    x += testpow<4>(rand(), loops);
    x += test4(rand(), loops);

    cout << "\n5 ";
    x += testpow<5>(rand(), loops);
    x += test5(rand(), loops);

    // Print the accumulated value so the compiler cannot discard the loops.
    cout << "\n" << x << "\n";
}

Results are:

1 00:00:01.126008 00:00:01.128338
2 00:00:01.125832 00:00:01.127227
3 00:00:01.125563 00:00:01.126590
4 00:00:01.126289 00:00:01.126086
5 00:00:01.126570 00:00:01.125930
2.45829e+54

Note that I accumulate the result of every pow calculation to make sure the compiler doesn't optimize it away.
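
The accumulation idea in the note above can be sketched in isolation. This is a minimal illustration, not the code used for the timings in this answer: timedSum is a hypothetical helper name, std::chrono::steady_clock is a choice made for this sketch, and changing the base on every iteration is borrowed from the 2021 update.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper, not part of the original benchmark.
// Calls f(base) `loops` times, changing base on every iteration so the call
// cannot be hoisted out of the loop, and sums the results so they stay live.
template <typename F>
std::pair<double, double> timedSum(F f, double base, long loops)
{
    using Clock = std::chrono::steady_clock;

    double sum = 0.0;
    auto start = Clock::now();
    for (long i = 0; i < loops; ++i)
    {
        sum += f(base);
        base += 1.0;
    }
    std::chrono::duration<double> seconds = Clock::now() - start;

    // first: elapsed seconds, second: accumulated result
    return {seconds.count(), sum};
}
```

The caller still has to use the returned sum, for example by printing it as the benchmark does with x at the end of main. Otherwise the compiler may be able to drop the loop anyway.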

If I use the std::pow(double, double) version, and loops = 1000000l, I get:

1 00:00:00.011339 00:00:00.011262
2 00:00:00.011259 00:00:00.011254
3 00:00:00.975658 00:00:00.011254
4 00:00:00.976427 00:00:00.011254
5 00:00:00.973029 00:00:00.011254
2.45829e+52

This is on an Intel Core Duo running Ubuntu 9.10 64-bit. Compiled using gcc 4.4.1 with -O2 optimization.

So in C, yes, x*x*x will be faster than pow(x, 3), because there is no pow(double, int) overload. In C++, it will be roughly the same. (Assuming the methodology in my testing is correct.)
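
When the exponent is a small non-negative integer that is only known at run time, one way to stay on the multiplication path is a small helper. The following is a minimal sketch using exponentiation by squaring; ipow is a hypothetical name that is not part of this answer's benchmark or of any standard library, and it was not timed here. Its results can differ from pow in the last bits because the roundings happen in a different order.

```cpp
// Hypothetical helper (an assumption for illustration, not benchmarked above).
// Computes base^exp for a non-negative integer exponent using only
// multiplications, via exponentiation by squaring.
double ipow(double base, unsigned exp)
{
    double result = 1.0;
    while (exp != 0)
    {
        if (exp & 1u)
        {
            result *= base;   // this bit of the exponent is set
        }
        base *= base;         // square for the next bit
        exp >>= 1;
    }
    return result;
}
```

For a fixed exponent such as 3, writing x*x*x directly, as measured above, remains the simplest option.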

This is in response to the comment made by An Markm:

Even if a using namespace std directive was issued, if the second parameter to pow is an int, then the std::pow(double, int) overload from <cmath> will be called instead of ::pow(double, double) from <math.h>.

This test code confirms that behavior:

#include <iostream>

namespace foo
{
    double bar(double x, int i)
    {
        std::cout << "foo::bar\n";
        return x*i;
    }
}

double bar(double x, double y)
{
    std::cout << "::bar\n";
    return x*y;
}

using namespace foo;

int main()
{
    double a = bar(1.2, 3);  // Prints "foo::bar"
    std::cout << a << "\n";
    return 0;
}
