Computation board - A simple MPI program question
r***e
Posts: 2000
1
Sorry for this stupid question.
A very simple/short MPI program runs on a 4-core
desktop computer. I would expect the speed-up to
drop after 8 (HT) threads, but it continues almost
in a straight line up to 128 threads (the default
system limit).
How is this possible? Thanks.
j**u
Posts: 6059
2
If there is not much communication between the cores, that can happen.

r***e
Posts: 2000
3

I still don't get it.
On a 4-core i7 processor with 8 hyper-threads,
say for an embarrassingly parallel problem
the sequential time is 1024 seconds:
with 2 processes each takes 512 seconds, S = 2;
with 4 processes each takes 256 seconds, S = 4;
with 8 processes each takes 128 seconds, S = 8;
with 16 processes each chunk takes 64 seconds,
but since only 8 can run at once and the other 8 have to wait,
it still takes 64 + 64 = 128 seconds, S = 8;
with 32 processes, S = 8 as well.
But what I am getting is this:

r***e
Posts: 2000
4

Can someone help, please? I am really confused.
I pasted the source code here:
http://pastebin.com/5rUN5Vfm

x*x
Posts: 365
5
Your timing method is wrong. Add a Barrier between starting the timer and starting the computation, then check again.
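
A minimal sketch of the timing pattern being suggested (hypothetical code, not taken from the thread): synchronize all ranks with MPI_Barrier before reading MPI_Wtime, and again before stopping the clock.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);      /* all ranks start the clock together */
        double t0 = MPI_Wtime();

        /* ... the actual computation would go here ... */

        MPI_Barrier(MPI_COMM_WORLD);      /* wait until every rank has finished */
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("elapsed: %f seconds\n", t1 - t0);

        MPI_Finalize();
        return 0;
    }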
r***e
Posts: 2000
6

Thank you for looking into this. I added a Barrier
immediately after starting the clock, and the
result is the same.

x*x
Posts: 365
7
Your work distribution is also wrong. No matter what num_nodes is in the loop, every rank (0 to world_size-1) performs the same computation, so the total amount of work is inversely proportional to num_nodes, and naturally you see linear speed-up. The correct approach is to have only rank 0 through rank num_nodes-1 take part in the computation, so that the total amount of work stays constant.
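
A rough sketch of the fix being described, with made-up names n and num_nodes standing in for the pastebin code: only ranks 0 through num_nodes-1 compute, each taking n/num_nodes iterations, so the total work stays at n no matter how many processes are launched.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_size, rank;
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        long n = 100000000L;          /* total amount of work (made-up figure) */
        int num_nodes = world_size;   /* number of ranks that should compute   */

        double local_sum = 0.0;
        if (rank < num_nodes) {
            /* Only ranks 0 .. num_nodes-1 work; each gets roughly n/num_nodes
               iterations, so the total work is always n regardless of world_size. */
            long chunk = n / num_nodes;
            long begin = (long)rank * chunk;
            long end   = (rank == num_nodes - 1) ? n : begin + chunk;
            for (long i = begin; i < end; ++i)
                local_sum += 1.0 / (double)(i + 1);   /* placeholder workload */
        }

        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %f\n", global_sum);

        MPI_Finalize();
        return 0;
    }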
r***e
Posts: 2000
8

No, each node's workload is (n/num_nodes).

x*x
Posts: 365
9
But more than num_nodes nodes are doing that work.

r***e
Posts: 2000
10

Got it, thanks!

r***e
Posts: 2000
11
Old bear, may I bother you with another naïve question?
I tried to isolate a problem, so I wrote this short program to
test basic send and receive:
http://pastebin.com/CMp63hkK
It works as expected on one computer running Fedora 22 with OpenMPI,
but "always" hangs on another computer running Fedora 22 with MPICH.
I tested on localhost only in both cases, with the same gcc version.
If I use reduce, or if I avoid send/receive on the same node (0),
then it works.
Is it a rule that I can't send to and receive from the same node (0),
or is there a mistake in my code?
Thank you!
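
A guess at the kind of pattern that can hang, not the actual pastebin code: rank 0 calls a blocking MPI_Send to itself before the matching MPI_Recv is posted.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value = 42, result = 0;
        if (rank == 0) {
            /* MPI_Send is allowed to block until the matching receive is posted,
               so sending to yourself before calling MPI_Recv may never return. */
            MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            MPI_Recv(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("received %d\n", result);
        }

        MPI_Finalize();
        return 0;
    }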
x*x
Posts: 365
12
MPI_Send has blocking semantics: it may or may not actually block, depending on the
MPI implementation. Both OpenMPI and MPICH behaved correctly; the problem is
with the program.
The correct way is to use MPI_Isend instead.
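
The same self-exchange written with MPI_Isend, as suggested above; a sketch rather than the poster's actual fix.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value = 42, result = 0;
        if (rank == 0) {
            MPI_Request req;
            /* The non-blocking send returns immediately, so the matching
               receive can be posted; MPI_Wait then completes the send. */
            MPI_Isend(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Recv(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            printf("received %d\n", result);
        }

        MPI_Finalize();
        return 0;
    }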

r***e
Posts: 2000
13

Thanks!
