
T4-1/Solaris 11 and sudden 50% performance drop

We have a SPARC T4-1 server running Solaris 11, and it does some fairly heavy parsing on a data set of roughly 100 GB.

All was well a few weeks ago: when I tested the performance, calculation times were roughly 50 minutes, which was more or less what we expected.
Now that I have started moving the server to its final location and tested the tools again, I noticed that performance has dropped dramatically. For some reason I can no longer parse the data set in under 2 hours, and I don't understand why.

I haven't made any modifications to the system except some network configuration, and to my knowledge nobody else has touched it.

Where should I start looking to find out why the parsing performance has dropped by more than 50%? Obviously the job is quite disk intensive, since our data set is so large, but our zpool seems to be healthy and reports no errors.
I checked /var/adm/messages, and there were no errors there either.
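
To put the slowdown in numbers, here is a minimal sketch that converts the two run times into an average throughput. It assumes the data set is exactly 100 GB (taken as 102400 MB) and that the runs took exactly 50 and 120 minutes; the function name `throughput_mb_s` is made up for illustration.

```shell
#!/bin/sh
# Average throughput in MB/s for a given data size and run time.
# Assumed figures: 100 GB = 102400 MB, run times of 50 and 120 minutes.
throughput_mb_s() {
    # $1 = data size in MB, $2 = run time in minutes
    echo "$1 $2" | awk '{ printf "%.1f\n", $1 / ($2 * 60) }'
}

before=$(throughput_mb_s 102400 50)
after=$(throughput_mb_s 102400 120)
echo "before: $before MB/s, after: $after MB/s"
```

Under those assumptions the job went from about 34 MB/s to about 14 MB/s on average, which gives a rough target to compare against whatever the disks and CPUs report.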

prstat doesn't show anything odd; our parsing processes (sed) take most of the CPU time.
Code:

  PID USERNAME  SIZE  RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP     
  2096 username 9024K 1880K cpu55    0    0  2:26:36 1.6% sed/1
  2098 username 8960K 1808K cpu28    0    0  2:26:30 1.6% sed/1
  2097 username 9024K 1848K sleep  22    0  0:41:52 0.4% grep/1
  2095 username 9024K 1872K sleep  59    0  0:02:13 0.0% grep/1
  2099 username  132M  122M sleep  59    0  0:01:19 0.0% sort/1
  327 root        0K    0K sleep  99  -20  0:01:30 0.0% zpool-workarea/262
  621 root      702M  226M sleep  59    0  0:02:07 0.0% eptelemon/2
  539 root      17M  13M sleep  59    0  0:01:07 0.0% ldmd/16
    47 netcfg  4944K 2824K sleep  59    0  0:00:00 0.0% netcfgd/4
    81 daemon  7936K 5936K sleep  59    0  0:00:00 0.0% kcfd/3
  102 root    3136K 1128K sleep  59    0  0:00:00 0.0% in.mpathd/1
    44 root    5184K 4064K sleep  59    0  0:00:00 0.0% dlmgmtd/8
    74 netadm  5344K 3032K sleep  59    0  0:00:00 0.0% ipmgmtd/5
  116 root    2392K 1776K sleep  59    0  0:00:00 0.0% pfexecd/3
  793 root    3360K 1944K sleep  59    0  0:00:00 0.0% in.routed/1
 NPROC USERNAME  SWAP  RSS MEMORY      TIME  CPU                           
    11 username  244M  164M  0.5%  5:38:31 3.6%
    62 root    1047M  444M  1.3%  0:06:10 0.0%
    1 netcfg  4944K 2824K  0.0%  0:00:00 0.0%
    2 daemon    11M 6912K  0.0%  0:00:00 0.0%
    1 netadm  5344K 3032K  0.0%  0:00:00 0.0%
Total: 82 processes, 914 lwps, load averages: 2.63, 2.64, 2.64
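
One way to read the 1.6% shown for each sed process: if prstat's CPU column is a percentage of all online CPUs, and if this domain sees 64 virtual CPUs (8 cores x 8 threads, which is an assumption; `psrinfo` would show the real count), then a single fully busy hardware thread shows up as roughly that figure. A small sketch of the arithmetic, with a made-up helper name:

```shell
#!/bin/sh
# Share of total CPU that one fully busy hardware thread represents.
# Assumption: 64 virtual CPUs; verify the real number with `psrinfo`.
one_thread_pct() {
    # $1 = number of virtual CPUs
    echo "$1" | awk '{ printf "%.1f\n", 100 / $1 }'
}

one_thread_pct 64
```

If the assumptions hold, each sed may already be using a whole hardware thread even though the machine as a whole looks almost idle.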

I tried looking at iostat and vmstat, but I don't really understand them. Any pointers on what I should look at?
The output looks like this (during parsing). iostat:
Code:

  tty        sd1          sd2          sd3          sd4            cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv  us sy wt id
  1  77 1664  14    4  1669  14    4  123  17  14  536  6    7    2  1  0 98
  0    0 3355  26    3  3303  26    3    0  0    0  1119  23    3    4  1  0 96
  0    0 3343  34    3  3414  31    3    0  0    0    0  0    0    4  1  0 96
  0    0 3328  26    2  3226  25    4    0  0    0    0  0    0    4  1  0 96
  0    0 2790  22    2  2893  23    3    0  0    0    0  0    0    4  1  0 96
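
As a rough way to read the listing above, the sketch below adds up the kps columns (fields 3, 6, 9 and 12 in this layout) and averages them over the interval samples. It assumes the first data row is the since-boot summary and skips it; the function name `total_kps_avg` is invented for this example, and the sample rows are the ones pasted above.

```shell
#!/bin/sh
# Average combined kps across sd1..sd4 from iostat output in the layout above.
# Assumption: the first data row is the since-boot summary, so it is skipped.
total_kps_avg() {
    awk '
        $1 ~ /^[0-9]+$/ {               # data rows start with a number (tin)
            rows++
            if (rows == 1) next          # skip the since-boot summary row
            sum += $3 + $6 + $9 + $12    # kps for sd1, sd2, sd3, sd4
            n++
        }
        END { if (n > 0) printf "%.0f\n", sum / n }
    '
}

total_kps_avg <<'EOF'
  tty        sd1          sd2          sd3          sd4            cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv  us sy wt id
  1  77 1664  14    4  1669  14    4  123  17  14  536  6    7    2  1  0 98
  0    0 3355  26    3  3303  26    3    0  0    0  1119  23    3    4  1  0 96
  0    0 3343  34    3  3414  31    3    0  0    0    0  0    0    4  1  0 96
  0    0 3328  26    2  3226  25    4    0  0    0    0  0    0    4  1  0 96
  0    0 2790  22    2  2893  23    3    0  0    0    0  0    0    4  1  0 96
EOF
```

For these samples the average comes to about 6693 KB/s across all four disks, with service times of only a few milliseconds, so the disks do not look saturated in this snapshot.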

vmstat:
Code:

kthr      memory            page            disk          faults      cpu
 r b w  swap  free  re  mf pi po fr de sr s1 s2 s3 s4  in  sy  cs us sy id
 0 0 0 14679712 16787992 7 16 0 0  0  0  8 14 14 17  6 2114 3539 2189  2  1 98
 0 0 0 1991952 4236944 2  6  0  0  0  0  0 27 26  0  0 1988 4960 2104  4  1 96
 0 0 0 1957032 4202024 0  1  0  0  0  0  0 26 26  0 20 2029 4929 2157  4  1 96
 0 0 0 1923640 4168632 0  1  0  0  0  0  0 32 35  0  0 2029 4940 2163  4  1 96
 0 0 0 1893096 4138088 0  0  0  0  0  0  0 31 32  0  0 2001 4889 2115  4  1 96

I know that I should change our CPU threading mode to max-ipc, as it is currently set to throughput, but this hasn't changed since the last time I ran the tests.

Thanks for any tips!
