I have a Fortran program that I compiled myself. I have run the executable hundreds of times without recompiling, but now it crashes instantly with a segmentation fault when I start it. Three other instances of the program are running right now. top outputs the following:

    top - 15:37:06 up 5 days,  1:06,  2 users,  load average: 3,00, 3,01, 3,06
    Tasks: 290 total,   4 running, 285 sleeping,   0 stopped,   1 zombie
    %Cpu(s): 24,4 us,  0,0 sy,  0,0 ni, 75,5 id,  0,1 wa,  0,0 hi,  0,0 si,  0,0 st
    KiB Mem :  8058952 total,  2409096 free,  2964692 used,  2685164 buff/cache
    KiB Swap:  8263676 total,  8263676 free,        0 used.  4614096 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
     1230 user      20   0 12,329g 675720   3080 R 100,0  8,4  14:17.45 tetramer
     1236 user      20   0 12,329g 675688   3052 R 100,0  8,4  13:58.96 tetramer
     1234 user      20   0 12,329g 675800   3168 R 100,0  8,4  14:02.23 tetramer

It does use a lot of memory (at least virtual memory), but until now it was possible to run several instances at the same time as long as the actual memory usage was low enough. The relevant Fortran code follows; the program crashes before it reaches the write statement.

          IMPLICIT REAL*8(A-H,O-Z)
    c
          PARAMETER ( np = 220 )
    c
          PARAMETER ( ndim = 25000)
          PARAMETER ( ndim2 = ndim*(ndim+1)/2 )
    C
          DIMENSION array(np,6,6),array2(np)
    c
          DIMENSION vector(50), vector2(50)
          DIMENSION v1(159,30001),v2(159,30001),v3(159,30001)
    C
          COMMON /PARM/com1(99000) ,com2(0:8,0:8,99000)
         1 ,com3(0:8),com4(0:8,0:8,0:8),nmax,mmax
         1 ,com5(0:8,0:8)
    C
          COMMON /SET/ AX(0:4,-4:4,50),AY(0:4,-4:4,50),AZ(0:4,-4:4,50)
         1 ,DD(0:4,-4:4,50), dd2(0:4,-4:4), nmax0(0:4,-4:4)
    C
          DIMENSION AH( ndim2 ),AF( ndim2 ),AF2(ndim2)
          DIMENSION E( ndim ),VEC( ndim,ndim)
          DIMENSION AH2(ndim,ndim),TEMP(ndim,ndim)
          dimension nbarray(6)
    C
          CHARACTER*1 PARI
    C
          write(6,*) ' ###### ##### '

I really have no idea why I am getting a segmentation fault all of a sudden. As far as I can tell, I am not even accessing any memory in the program yet (just allocating), so how can I get a segmentation fault?
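For a sense of scale, the large arrays declared in the listing above can be added up by hand. The following is a minimal Python sketch, not part of the Fortran program. It assumes REAL*8 takes 8 bytes per element and counts only the arrays sized by ndim plus v1, v2 and v3; the constant names are chosen here for illustration. The total need not match the VIRT column in top, since the running binaries may have been built with a different ndim.

```python
# Sizes copied from the PARAMETER statements in the listing.
NDIM = 25000
NDIM2 = NDIM * (NDIM + 1) // 2   # packed triangular storage length

REAL8_BYTES = 8                  # assumption: REAL*8 is 8 bytes per element
GIB = 1024 ** 3

# AH, AF, AF2: three packed arrays of length ndim2
packed_bytes = 3 * NDIM2 * REAL8_BYTES
# VEC, AH2, TEMP: three full ndim x ndim matrices
square_bytes = 3 * NDIM * NDIM * REAL8_BYTES
# v1, v2, v3: three 159 x 30001 arrays
work_bytes = 3 * 159 * 30001 * REAL8_BYTES

# The remaining arrays (COMMON blocks, E, small vectors) are left out;
# together they add well under 0.1 GiB.
total_bytes = packed_bytes + square_bytes + work_bytes

print(f"packed arrays : {packed_bytes / GIB:6.2f} GiB")
print(f"square arrays : {square_bytes / GIB:6.2f} GiB")
print(f"v1, v2, v3    : {work_bytes / GIB:6.2f} GiB")
print(f"total         : {total_bytes / GIB:6.2f} GiB")
```

Pages of statically declared arrays only become resident once they are written to, which is presumably why RES stays far below VIRT in the top output.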

Also, when piping the output of the program into a Perl script, I got a SIGPIPE for some reason, although it was the Fortran program that crashed, not the Perl script.

Does anyone have any idea what might be happening here and how I can fix it?

I am running Ubuntu 16.04, if that's relevant.

Edit: requested outputs are:

    ~$ ldd ./tetramer
            not a dynamic executable
    ~$ strace ./tetramer
    execve("./tetramer", ["./tetramer"], [/* 32 vars */]) = -1 ENOMEM (Cannot allocate memory)
    --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
    +++ killed by SIGSEGV +++
    Segmentation fault (core dumped)

I also did some testing, and it is always the fourth instance of the program that crashes with a segmentation fault. I recently did a reinstall (I wiped an older Ubuntu and installed 16.04), so it may be that under 16.04 I could only ever run three at a time and didn't notice. The times when I am absolutely sure that more than three instances were running were all before the reinstall.

I think it may have to do with the fact that the program tries to allocate 12 GB of memory when total memory plus swap is only 16 GB. But with the parameters I am using right now it only really needs around 1 GB (see the RES column), so I don't see why I can't run more than three instances.
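To make that comparison concrete, here is a minimal Python sketch that only redoes the arithmetic on the figures from the top output above; it does not query the system, and the variable names are chosen here for illustration. On Linux, how far total virtual size may exceed RAM plus swap depends on the overcommit policy, which can be read from /proc/sys/vm/overcommit_memory.

```python
# Figures copied from the top output in the question.
MEM_TOTAL_KIB = 8058952
SWAP_TOTAL_KIB = 8263676
VIRT_PER_INSTANCE_GIB = 12.329   # VIRT column of each tetramer process
RES_PER_INSTANCE_KIB = 675800    # largest RES value among the three
RUNNING = 3

KIB_PER_GIB = 1024 * 1024

ram_plus_swap_gib = (MEM_TOTAL_KIB + SWAP_TOTAL_KIB) / KIB_PER_GIB
virt_running_gib = RUNNING * VIRT_PER_INSTANCE_GIB
virt_with_fourth_gib = (RUNNING + 1) * VIRT_PER_INSTANCE_GIB
res_running_gib = RUNNING * RES_PER_INSTANCE_KIB / KIB_PER_GIB

print(f"RAM + swap               : {ram_plus_swap_gib:6.2f} GiB")
print(f"VIRT, three instances    : {virt_running_gib:6.2f} GiB")
print(f"VIRT, with a fourth      : {virt_with_fourth_gib:6.2f} GiB")
print(f"RES, three instances     : {res_running_gib:6.2f} GiB")
```

The three running instances already reserve more virtual memory than RAM plus swap combined, while their resident memory is only a small fraction of it. So the system is already overcommitting, and the open question is where the limit that stops the fourth instance comes from.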

  • Did you update any packages recently? Commented Jun 13, 2016 at 14:18
  • Any useful information from 'dmesg' command or in /var/log/syslog (or similar files)? You might see OOM ("out of memory") messages. Commented Jun 13, 2016 at 14:24
  • That's what I updated today: bsdutils grep libblkid1 libfdisk1 libmetacity-private3a libmount1 libsmartcols1 libuuid1 lshw metacity-common mount mtr-tiny thermald util-linux uuid-runtime. I am not sure whether the problem occurred before today (maybe it first happened sometime since Thursday, but I can't really check), and I also don't know how to check when and what I updated recently. Commented Jun 13, 2016 at 14:25
  • @StephenHarris: there's no OOM in dmesg, that's the only message today: do_general_protection: 33 callbacks suppressed (plus some messages about random time, whatever that is) Commented Jun 13, 2016 at 14:32
  • @StephenHarris: there's also nothing in /var/log/syslog that seems to be connected to this Commented Jun 13, 2016 at 14:35
