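Problem
A user reported that a backup application takes 4-7 minutes to connect. The strace output shows that a single rt_sigtimedwait system call accounts for nearly 3 minutes (about 155 seconds) of that time.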
777 09:28:20.256729 rt_sigtimedwait([INT QUIT TERM XCPU XFSZ PWR], NULL, NULL, 8 <unfinished ...>
777 09:30:55.672016 <... rt_sigtimedwait resumed> ) = 2 <155.415104>
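Why does rt_sigtimedwait take so long?
What does rt_sigtimedwait do?
According to the man page, a call to sigtimedwait() suspends the calling thread until one of the specified signals becomes pending. It behaves exactly like sigwaitinfo(), except that it takes an additional timeout argument.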
sigwaitinfo, sigtimedwait - synchronously wait for queued signals
sigwaitinfo() suspends execution of the calling thread until one of the signals in set is pending (If one of the signals in set is already pending for the calling thread, sigwaitinfo() will return immediately.)
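So, to solve the user's problem, the investigation has to start at the application level. sigtimedwait() is simply doing what the application asked; it takes a long time because it spends the whole time waiting for a signal. In the trace above the timeout argument is NULL, so the call blocks with no time limit, and it returned 2 (SIGINT) only once that signal arrived.
Example program
I found an example program on the IBM website and modified it slightly so that it compiles.
The program calls alarm(10) to start a timer that sends SIGALRM after 10 seconds. It then calls sigtimedwait() to wait for SIGALRM.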
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>   /* alarm() */

/* Installed as the SIGALRM handler, but it never runs in this program:
   SIGALRM is blocked and then consumed by sigtimedwait(). */
void catcher( int sig ) {
    printf( "Signal catcher called for signal %d\n", sig );
}

/* Print the current wall-clock time with a label. */
void timestamp( const char *str ) {
    time_t t;
    time( &t );
    printf( "The time %s is %s\n", str, ctime(&t) );
}

int main( int argc, char *argv[] ) {
    int result = 0;
    struct sigaction sigact;
    sigset_t waitset;
    siginfo_t info;
    struct timespec timeout;

    sigemptyset( &sigact.sa_mask );
    sigact.sa_flags = 0;
    sigact.sa_handler = catcher;
    sigaction( SIGALRM, &sigact, NULL );

    /* Block SIGALRM so it stays pending until sigtimedwait() picks it up. */
    sigemptyset( &waitset );
    sigaddset( &waitset, SIGALRM );
    sigprocmask( SIG_BLOCK, &waitset, NULL );

    timeout.tv_sec = 30;     /* Number of seconds to wait */
    timeout.tv_nsec = 1000;  /* Number of nanoseconds to wait */

    alarm( 10 );  /* Send SIGALRM after 10 sec. */

    timestamp( "before sigtimedwait()" );
    result = sigtimedwait( &waitset, &info, &timeout );
    if ( result == -1 )
        perror( "sigtimedwait" );  /* timeout (EAGAIN) or error: info is not filled in */
    else
        printf( "sigtimedwait() returned for signal %d\n", info.si_signo );
    timestamp( "after sigtimedwait()" );

    return( result );
}
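The strace output shows that rt_sigtimedwait() spent 10 seconds waiting for the signal. The call was made by the application, and its duration depends on when the signal arrives, so the long wait is not the operating system's fault.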
$ strace -Tttfv -s 8192 ./sig_test
15:52:35.291022 rt_sigtimedwait([ALRM], {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=64550200, ptr=0x3d8f538}}, {30, 1000}, 8) = 14 <9.999445>
References
https://www.linuxprogrammingblog.com/code-examples/signal-waiting-sigtimedwait
https://www.ibm.com/support/knowledgecenter/ja/ssw_i5_54/apis/sigtwait.htm