mayaview 发表于 2014-2-14 17:26

Scipy和Numpy中的特征值函数的效率

http://forum.vibunion.com/thread-123356-1-1.html 中Rainyboy讨论了Scipy和Numpy里特征值分解函数的适用范围。碰巧我前段时间也看过相关的内容,讲一讲这块的几个函数的性能吧!

照理来说,Scipy和Numpy里的特征值分解都进行了很强的优化,速度和Matlab的特征值分解是一致的,当然和C或者Fortran的速度也是一致的的,因为用得是相同的函数库。是不是这样呢?+

在我的机器上的运算结果是这样的:
In :

size=1000
size
Out:
1000
In :

%%timeit
A=np.random.random((size,size))
100 loops, best of 3: 12 ms per loop
In :

A=np.random.random((size,size))
B=np.random.random((size,1))
In :

%%timeit
A.dot(B)
10000 loops, best of 3: 106 μs per loop
In :

%%timeit
A.dot(A)
10 loops, best of 3: 26.9 ms per loop
In :

A=A.dot(A.T)
In :

%timeit np.linalg.eig(A)
1 loops, best of 3: 1.04 s per loop
In :

%timeit np.linalg.eigh(A)
10 loops, best of 3: 159 ms per loop
In :

%timeit np.linalg.eigvals(A)
1 loops, best of 3: 552 ms per loop
In :

%timeit np.linalg.eigvalsh(A)
10 loops, best of 3: 72.5 ms per loop
In :

%timeit sp.linalg.eig(A)
1 loops, best of 3: 994 ms per loop
In :

%timeit sp.linalg.eigh(A)
10 loops, best of 3: 177 ms per loop
In :

%timeit sp.linalg.eigvals(A)
1 loops, best of 3: 553 ms per loop
In :

%timeit sp.linalg.eigvalsh(A)
10 loops, best of 3: 75.3 ms per loop
In :

%timeit sp.linalg.eigvals_banded(A)
1 loops, best of 3: 889 ms per loop
In :

%timeit sp.linalg.eig_banded(A)
1 loops, best of 3: 1.14 s per loop


同样在Matlab里等效的代码:

size=1000;
disp('Random:');
tic; A=rand(size);toc
b=rand(size,1);
disp('Matrix dot Vector:');
tic; A*b; toc
disp('Matrix dot Matrix:');
tic; A*A; toc
A=A*A';
disp('Eig:');
tic;=eig(A);toc
disp('Eigvalue:');
tic;e=eig(A);toc


Random:
Elapsed time is 0.012240 seconds.
Matrix dot Vector:
Elapsed time is 0.011468 seconds.
Matrix dot Matrix:
Elapsed time is 0.045180 seconds.
Eig:
Elapsed time is 0.148208 seconds.
Eigvalue:
Elapsed time is 0.065679 seconds.



把size提升到5000,结果是这样的:

Random:
Elapsed time is 0.259863 seconds.
Matrix dot Vector:
Elapsed time is 0.010714 seconds.
Matrix dot Matrix:
Elapsed time is 2.755659 seconds.
Eig:
Elapsed time is 17.550121 seconds.
Eigvalue:
Elapsed time is 12.204438 seconds.


对应的Python的结果是:
size=5000
size
Out:
5000
In :

%%timeit
A=np.random.random((size,size))
1 loops, best of 3: 294 ms per loop
In :

A=np.random.random((size,size))
B=np.random.random((size,1))
In :

%%timeit
A.dot(B)
100 loops, best of 3: 9.58 ms per loop
In :

%%timeit
A.dot(A)
1 loops, best of 3: 2.73 s per loop
In :

A=A.dot(A.T)
In :

%timeit np.linalg.eig(A)
1 loops, best of 3: 1min 49s per loop
In :

%timeit np.linalg.eigh(A)
1 loops, best of 3: 17.3 s per loop
In :

%timeit np.linalg.eigvals(A)
1 loops, best of 3: 45.8 s per loop
In :

%timeit np.linalg.eigvalsh(A)
1 loops, best of 3: 11.7 s per loop
In :

%timeit sp.linalg.eig(A)
1 loops, best of 3: 1min 47s per loop
In :

%timeit sp.linalg.eigh(A)
1 loops, best of 3: 20.3 s per loop
In :

%timeit sp.linalg.eigvals(A)
1 loops, best of 3: 47.3 s per loop
In:%timeit sp.linalg.eigvalsh(A)1 loops, best of 3: 12.5 s per loop


结论很明显了,我先不总结了,大家先讨论讨论吧。统计一下重要的时间:

SizeMatlabPython
eig eigvalueM dot VM dot Mrandnp.eignp.eighnp.eigvalsnp.eigvalshsp.eigsp.eighM dot VM dot Mrand
1000148.2165.6811.4745.1812.24104015955272.59941770.10626.912
500017550.1212204.4410.712755.66259.86119000173004580011700117000203009.582730294

PS:排版很困难啊,怎么回事?直接上纯文本吧!

Rainyboy 发表于 2014-2-17 20:36

在使用中确实有这个问题,随着规模的增大,python那几个库函数算得很慢了。

mayaview 发表于 2014-2-17 21:12

Rainyboy 发表于 2014-2-17 20:36
在使用中确实有这个问题,随着规模的增大,python那几个库函数算得很慢了。

应该没有这个问题,Matlab可以自动检测矩阵是不是对称的,虽然我不确定它是怎么做到的,但是从计算结果来看Matlab确实使用了矩阵是对称的条件(放一个非对称阵进去,Matlab就会自动用非对称阵的算法,完全不知道是怎么办到的)。Scipy和Numpy里必须显式的告诉特征值函数矩阵是对称的,求解速度才能上来,如果使用对应一般矩阵的算法就会很慢。
页: [1]
查看完整版本: Scipy和Numpy中的特征值函数的效率