IBM pSeries manual

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32

Go to page of

A good user manual

The rules should oblige the seller to give the purchaser an operating instrucion of IBM pSeries, along with an item. The lack of an instruction or false information given to customer shall constitute grounds to apply for a complaint because of nonconformity of goods with the contract. In accordance with the law, a customer can receive an instruction in non-paper form; lately graphic and electronic forms of the manuals, as well as instructional videos have been majorly used. A necessary precondition for this is the unmistakable, legible character of an instruction.

What is an instruction?

The term originates from the Latin word „instructio”, which means organizing. Therefore, in an instruction of IBM pSeries one could find a process description. An instruction's purpose is to teach, to ease the start-up and an item's use or performance of certain activities. An instruction is a compilation of information about an item/a service, it is a clue.

Unfortunately, only a few customers devote their time to read an instruction of IBM pSeries. A good user manual introduces us to a number of additional functionalities of the purchased item, and also helps us to avoid the formation of most of the defects.

What should a perfect user manual contain?

First and foremost, an user manual of IBM pSeries should contain:
- informations concerning technical data of IBM pSeries
- name of the manufacturer and a year of construction of the IBM pSeries item
- rules of operation, control and maintenance of the IBM pSeries item
- safety signs and mark certificates which confirm compatibility with appropriate standards

Why don't we read the manuals?

Usually it results from the lack of time and certainty about functionalities of purchased items. Unfortunately, networking and start-up of IBM pSeries alone are not enough. An instruction contains a number of clues concerning respective functionalities, safety rules, maintenance methods (what means should be used), eventual defects of IBM pSeries, and methods of problem resolution. Eventually, when one still can't find the answer to his problems, he will be directed to the IBM service. Lately animated manuals and instructional videos are quite popular among customers. These kinds of user manuals are effective; they assure that a customer will familiarize himself with the whole material, and won't skip complicated, technical information of IBM pSeries.

Why one should read the manuals?

It is mostly in the manuals where we will find the details concerning construction and possibility of the IBM pSeries item, and its use of respective accessory, as well as information concerning all the functions and facilities.

After a successful purchase of an item one should find a moment and get to know with every part of an instruction. Currently the manuals are carefully prearranged and translated, so they could be fully understood by its users. The manuals will serve as an informational aid.

Table of contents for the manual

  • Page 1

    IBM ~ pSeri es Hig h Perf o rm ance Sw it ch Tuni ng an d Debug Gui de Versi on 1.0 April 2005 IB M Systems and Technology Group Clus ter Perf ormance Department Poughk eepsi e, NY[...]

  • Page 2

    pshps t unin ggui d ewp0 401 05. doc P a ge 2 Content s 1.0 Introduction..................................................................................................... 4 2.0 Tunables a nd settin gs for switch softwa re ...................................................... 5 2.1 MP I tunab les for Pa ra llel Env ironm ent ....................[...]

  • Page 3

    pshps t unin ggui d ewp0 401 05. doc P a ge 3 5.10 MP_PRINTEN V ...................................................................................... 22 5.11 MP_STATIS TICS .................................................................................... 23 5.12 Dropped switch packets ............................................................[...]

  • Page 4

    pshps t unin ggui d ewp0 401 05. doc P a ge 4 1.0 I ntroduc tion This pape r is in tended to hel p you tune and de bu g the p e rf or man ce of the I BM ® pSe ries® High P erform a nce Switch (HPS) o n IBM Cluste r 1600 system s. It is n o t i ntended t o be a com pr ehen s ive gu ide , but rathe r to hel p in initi al tuning a nd debugging of pe[...]

  • Page 5

    pshps t unin ggui d ewp0 401 05. doc P a ge 5 2.0 Tunables and settings f or switc h software T o opt i miz e t h e HP S , you c a n s et sh el l va r i a b l es f or P ar a llel E nv ir o n men t M P I - b a s ed w or k l oa ds an d for I P-bas e d w ork lo ads . Th is se cti on rev ie w s th e s hel l v ariab le s th at are m o st o f ten used fo[...]

  • Page 6

    pshps t unin ggui d ewp0 401 05. doc P a ge 6 th read , and from wi t hin the MP I/LAPI pol ling code th a t is invoke d when the appl ic ation m akes blo cking MPI call s. MP_POLLING_IN T ERVA L sp e cifie s the num ber o f micro seconds an MPI/LA PI se r vice t hre ad sho uld wait ( slee p ) be fore i t check s whe ther any data prev iously sen t[...]

  • Page 7

    pshps t unin ggui d ewp0 401 05. doc P a ge 7 2.1. 5 MP_TASK_AFFINI TY Se tting MP_TA SK_A FFINI TY to SNI te lls parallel ope r ating envi r onmen t (POE) to bind each task to the MCM con taining the HPS adap te r it w ill use, so th at th e ad apte r , CPU, an d mem or y used by any task a re all lo cal to t he sam e MCM. To pre ven t m ulti ple [...]

  • Page 8

    pshps t unin ggui d ewp0 401 05. doc P a ge 8 So meti mes M P I-IO is u s ed in an a p plicat ion a s if it we r e ba s ic P OS IX read/writ e, eit h er becau s e the r e i s no nee d for mo re complex re a d/ write patte rns or be cause the a pplication w as pr evi ously h and-o ptimize d t o use POSIX re ad /w rite . In such cas es, i t is o ften[...]

  • Page 9

    pshps t unin ggui d ewp0 401 05. doc P a ge 9 rfifosize 0x1000000 receive fifo size False rpoolsize 0x02000000 IP receive pool size True spoolsize 0x02000000 IP send pool size True 3.0 Tunables and setti ngs for AIX 5L Seve r a l se ttings in AI X 5L im pact the pe rfo r m a nc e o f t he H PS. T h e se in clude the IP a nd mem o r y subsy stems. T[...]

  • Page 10

    pshps t unin ggui d ewp0 401 05. doc P a ge 10 The ove rhe a d in m aintain ing the file cac he can im pact t he p e rfo r mance of large paralle l app l i c atio n s. M uch o f t he ove r he ad i s ass o ciate d w i th the sync() syst em cal l (b y defau lt, run ev er y mi nu t e fr o m t h e sy n cd daemon). The sy n c() sy st em call sc ans al l[...]

  • Page 11

    pshps t unin ggui d ewp0 401 05. doc P a ge 11 3.3. 1 svmon Th e svm on com m a nd provide s in for m ation about the virtual me mo r y usage by the kernel and u s er p r oc es s es i n t h e s ys t e m a t a ny g i v e n t i m e. F or ex a mp l e, t o s e e s y s t e m - w i d e i n f or ma t i o n about th e segme n ts (256MB chunk of virtual m e[...]

  • Page 12

    pshps t unin ggui d ewp0 401 05. doc P a ge 12 Pa geSiz e Inu se Pin P gsp Vi rtu al 4KB 448221 3687 2675 449797 16 MB 0 0 0 0 Vs i d E sid T yp e De sc r ip ti on LP a ge Inu se P in P gsp Vi rtu a l 1f187f 11 w ork te xt data BSS he a p - 56789 0 0 56 789 218 a 2 70000000 w ork defaul t shm a t/ mm a p - 33680 0 0 33 680 131893 17 w o rk te xt d [...]

  • Page 13

    pshps t unin ggui d ewp0 401 05. doc P a ge 13 statisti cs i n 5-se cond in terv als, wi th t he f irst se t of statisti cs being the statisti cs si nc e the node o r LP AR wa s la s t b oot e d . vm sta t 5 The pi and po of the page group i s the number of 4KB pages re ad f rom and w r itten to the paging devic e b et ween co ns ecuti ve sa m pli [...]

  • Page 14

    pshps t unin ggui d ewp0 401 05. doc P a ge 14 adapte r is c onf ig ur ed . Th e vol ume of re serv ation i s proportion a l t o the number o f user w indow s con figured on the HPS a dapte r. A priv ate window i s required f o r each MP I task. He r e i s a f o rm ula t o c al c ul at e t he num be r o f TLPs nee de d by th e HPS ad apte r. I n th[...]

  • Page 15

    pshps t unin ggui d ewp0 401 05. doc P a ge 15 3.5 Large pages and IP support One o f the mo st important w ay s to im pr ove I P pe rfor mance on the HPS is to en sur e th a t large pages a re en abled . Lar ge pa ges are re quired to a llocate a n u m b e r of l arge page s which will used b y t h e H P S I P d r iv er a t b oot t i m e. Each sn [...]

  • Page 16

    pshps t unin ggui d ewp0 401 05. doc P a ge 16 If yo u have eigh t cards fo r p690 (o r four ca rds fo r p655 ), thi s com ma nd al so indic ates whe ther yo u have fu ll mem o r y bandwid t h. 3.8 Debug set tings i n the AIX 5L kernel The AI X 5L ke rnel has seve ra l de bug se tting s th at a ffect the performan ce of an appl ic ation. To m a ke [...]

  • Page 17

    pshps t unin ggui d ewp0 401 05. doc P a ge 17 4.2 LoadLev el er dae mons The LoadLevele r ® da emon s ar e ne eded f or MPI application s using HP S . Ho weve r , you can lowe r the im pact on a parallel a pplic ation by ch anging the de fault se tting s fo r the se d aemon s. You c an lowe r the im pact o f the L oad Levele r daemon s by: • Re[...]

  • Page 18

    pshps t unin ggui d ewp0 401 05. doc P a ge 18 SC HEDD_DEBUG = -D_A LWAYS 4.3 S ett i ngs for AIX 5L threads Seve ral v ar iable s hel p you use AIX 5L th r e a ds to tune pe r f orm ance . The se are the recomm ende d in itial se tting s fo r A IX 5L th re a ds wh en using HPS. Set th e m in the /et c/en vironm ent file . A I XTHREAD_SCOP E=S A IX[...]

  • Page 19

    pshps t unin ggui d ewp0 401 05. doc P a ge 19 5.0 Debug set tings and dat a collect ion tools S e v er a l d ebu g set t i n gs a nd da t a c ol l e c t i o n t o o ls c a n h el p you de b u g a p er f or ma nc e p r ob l e m o n sy st em s using HPS. Th i s se ction c on tains a sub se t of the mo st c omm on setting c hange s and t o ols . I f [...]

  • Page 20

    pshps t unin ggui d ewp0 401 05. doc P a ge 20 5.3 Affini t y L PARs On p690 sy stems, if you are runn ing wi th more th an one LPAR for e ach CEC, m a ke sure yo u ar e r unn i n g a ff i nit y LP AR s . T o c hec k a ff i n it y b et we e n C P U, me mor y, a nd HP S lin ks , r u n t he assoc iativ i t y scri pts o n t he LPA Rs. To ch e ck the m[...]

  • Page 21

    pshps t unin ggui d ewp0 401 05. doc P a ge 21 On the HMC GU I, selec t Se r vice A pplic ation s -> Se rvice Fo c al Poi n t - > Sele ct Se r vice a ble Even ts. 5.7 errpt command On AI X 5L, t he errpt c o mma n d list s a s u mmar y of s ys t e m er r or mes s a g es. S ome of t he H P S su bs y st e m e rro rs are co ll e c ted by e rrpt [...]

  • Page 22

    pshps t unin ggui d ewp0 401 05. doc P a ge 22 • Fo r HA L l i b rari e s: ds h su m /u sr /s ni /a ix 52 /li b /l ib ha l_ r. a • Fo r MP I l i bra ri e s: ds h su m /u sr /l pp /p pe .p oe/ l ib /l ib mp i_ r. a (o r run with MP_P RI NTEN V=ye s) T o mak e s u r e y ou a r e r unni n g t h e c or r ect c o mb i na t i o n of H AL, L AP I, a n[...]

  • Page 23

    pshps t unin ggui d ewp0 401 05. doc P a ge 23 MEMORY _A F FINI TY Single Thre ad Usage(MP_SINGLE_THREA D) Hin ts Fi l te red (MP_H IN TS_ FIL TE RED ) MP I-I/ O Buff er S ize (MP_IO_B UFFER _S IZE) MP I-I/O Err or Lo ggin g (MP _IO_ ER R LOG) MPI-I/ O Node File (MP_IO_ NODEF ILE) MPI-I/ O T ask List (MP _ IO_T ASKLIST ) Syst em Ch eckp o i nta b l[...]

  • Page 24

    pshps t unin ggui d ewp0 401 05. doc P a ge 24 MPCI: se nds = 14 MPCI: se nd sComple te = 14 MPCI: se nd Wai tsC omple t e = 17 MPCI: recv s = 17 MPCI: recv WaitsCom plete = 13 M PCI : e arl yA rri v al s = 5 MPCI: e a r lyA rr iv alsMatched = 5 MP C I: la t eArr i va ls = 8 MPCI: sh ove s = 10 MP C I: p ulls = 1 3 M PCI : thre ade d L ock Y ie ld [...]

  • Page 25

    pshps t unin ggui d ewp0 401 05. doc P a ge 25 Run the follow ing c omm a nd: /usr/sbi n /ifsn _dump - a T he dat a is coll ect ed in sn i .sn ap ( sn i _dum p .ou t .Z ) , and pro vide s usef ul in form ation , such as th e loca l mac a ddr ess: m a c_addr 0:0: 0 :40:0:0 If yo u a r e see ing arpq d rops, e n sure the source h a s the c orre ct m [...]

  • Page 26

    pshps t unin ggui d ewp0 401 05. doc P a ge 26 To he l p you i solate the e xact cause of packe t dr ops, the i f s n_dum p -a c o mma n d a ls o lis ts t h e follow i ng debug statistic s. I f y o u i solate packe t d rops to the s e statisti cs, you wi ll probably need to con tact IB M suppo rt. dbg: | sNet_drop 0x00000000 [0] | sRTF_drop 0x00000[...]

  • Page 27

    pshps t unin ggui d ewp0 401 05. doc P a ge 27 T h er e a r e t wo r ou t es . sending packe t using r out e No . 1 ml ip ad d r ess structu re , sta rting : ml fl ag (ml in t erface up o r down) = 0x00000000 m l ti ck = 0 m l ip add ress = 0xc 0a80203, 19 2.168.2 .3 T h er e a r e t w o p r ef er r ed r ou t e p a ir s : f rom loc al if 0 t o re m[...]

  • Page 28

    pshps t unin ggui d ewp0 401 05. doc P a ge 28 MA C WOF ( 2F870): B i t: 1 [. . .] 5.12.4 P ack ets d ropp ed in th e s w i tch h ardw are If a pa c k et is dr op p ed wit h i n t he s w it c h ha r d wa r e it s elf (f or ex a mp l e, wh en t r a ver s i n g th e l i nk b et w e e n t w o s w i t c h c hip s ) , e vi d e n c e o f t h e p a c k et[...]

  • Page 29

    pshps t unin ggui d ewp0 401 05. doc P a ge 29 5.14 LAPI _DEBUG_COMM_TIMEOUT If the L API proto col e xperience s c omm unication time outs, se t the envi ronme nt v ariable LA PI_DEBUG_C OMM_T IMEOUT to PAUS E . Thi s cause s the appl ic ation to issue a pause( ) ca ll w he n enc ou nt er i n g a t ime ou t , w hi c h s t op s t h e a pp li c a t [...]

  • Page 30

    pshps t unin ggui d ewp0 401 05. doc P a ge 30 5.16 AIX 5L trace for daemon activi ty If yo u suspect th a t a sy stem da emon is causi ng a pe r form a nce problem on yo ur sy st em , run AIX 5 L t r a ce t o c h ec k f or da em o n a c t i vit y. F or ex a mp l e, t o f i n d ou t w hic h da emo n s a r e ta kin g up CPU tim e , use the f ollowin[...]

  • Page 31

    pshps t unin ggui d ewp0 401 05. doc P a ge 31 7.2 MPI document ation P arallel En vironme nt f or AIX 5L V4 .1.1 Hitc hhike r's G uide, SA 22- 7947-01 P arallel Env ironmen t for A IX 5L V 4.1. 1 Ope ration and Use, V olume 1 , SA22-7948- 01 P arallel Env ironmen t for A IX 5L V 4.1. 1 Ope ration and Use, V olume 2 , SA 22-7949-01 P arallel E[...]

  • Page 32

    pshps t unin ggui d ewp0 401 05. doc P a ge 32 © IBM Cor poration 20 05 IBM Corporati on Marketing Com m unicati ons System s Gr oup Route 10 0 Somer s, New York 1 0589 Produced i n the Uni ted States of Amer ica April 2005 All Rights R eserved T his docum ent was developed for pro ducts and/or s ervices offered in the Uni ted Stat es. IBM ma y no[...]