UKOUG Conference 2007 Luca Canali, CERN IT A Closer Look inside Oracle ASM.

Click here to load reader

download UKOUG Conference 2007 Luca Canali, CERN IT A Closer Look inside Oracle ASM.

of 40

Transcript of UKOUG Conference 2007 Luca Canali, CERN IT A Closer Look inside Oracle ASM.

  • Slide 1

UKOUG Conference 2007 Luca Canali, CERN IT A Closer Look inside Oracle ASM Slide 2 Outline Oracle ASM for DBAs Introduction and motivations ASM is not a black box Investigation of ASM internals Focus on practical methods and troubleshooting ASM and VLDB Metadata, rebalancing and performance Lessons learned from CERNs production DB services Slide 3 ASM Oracle Automatic Storage Management Provides the functionality of a volume manager and filesystem for Oracle (DB) files Works with RAC Oracle 10g feature aimed at simplifying storage management Together with Oracle Managed Files and the Flash Recovery Area An implementation of S.A.M.E. methodology Goal of increasing performance and reducing cost Slide 4 ASM for a Clustered Architecture Oracle architecture of redundant low-cost components Servers SAN Storage Slide 5 ASM Disk Groups Example: HW = 4 disk arrays with 8 disks each An ASM diskgroup is created using all available disks The end result is similar to a file system on RAID 1+0 ASM allows to mirror across storage arrays Oracle RDBMS processes directly access the storage RAW disk access Mirroring Striping Failgroup1 Failgroup2 ASM Diskgroup Slide 6 Files, Extents, and Failure Groups Files and extent pointers Failgroups and ASM mirroring Slide 7 ASM Is not a Black Box ASM is implemented as an Oracle instance Familiar operations for the DBA Configured with SQL commands Info in V$ views Logs in udump and bdump Some secret details hidden in X$TABLES and underscore parameters Slide 8 Selected V$ Views and X$ Tables View NameX$ TableDescription V$ASM_DISKGROUPX$KFGRPperforms disk discovery and lists diskgroups V$ASM_DISKX$KFDSK, X$KFKIDperforms disk discovery, lists disks and their usage metrics V$ASM_FILEX$KFFILlists ASM files, including metadata V$ASM_ALIASX$KFALSlists ASM aliases, files and directories V$ASM_TEMPLATEX$KFTMTAASM templates and their properties V$ASM_CLIENTX$KFNCLlists DB instances connected to ASM V$ASM_OPERATIONX$KFGMGlists current rebalancing operations N.A.X$KFKLIBavailable libraries, includes asmlib N.A.X$KFDPARTNERlists disk-to-partner relationships N.A.X$KFFXPextent map table for all ASM files N.A.X$KFDATallocation table for all ASM disks Slide 9 ASM Parameters Notable ASM instance parameters: *.asm_diskgroups='TEST1_DATADG1','TEST1_RECODG1' *.asm_diskstring='/dev/mpath/itstor*p*' *.asm_power_limit=5 *.shared_pool_size=70M *.db_cache_size=50M *.large_pool_size=50M *.processes=100 Slide 10 More ASM Parameters Underscore parameters Several undocumented parameters Typically dont need tuning Exception: _asm_ausize and _asm_stripesize May need tuning for VLDB in 10g New in 11g, diskgroup attributes V$ASM_ATTRIBUTE, most notable disk_repair_time au_size X$KFENV shows underscore attributes Slide 11 ASM Storage Internals ASM Disks are divided in Allocation Units (AU) Default size 1 MB (_asm_ausize) Tunable diskgroup attribute in 11g ASM files are built as a series of extents Extents are mapped to AUs using a file extent map When using normal redundancy, 2 mirrored extents are allocated, each on a different failgroup RDBMS read operations access only the primary extent of a mirrored couple (unless there is an IO error) In 10g the ASM extent size = AU size Slide 12 ASM Metadata Walkthrough Three examples follow of how to read data directly from ASM. Motivations: Build confidence in the technology, i.e. get a feeling of how ASM works It may turn out useful one day to troubleshoot a production issue. Slide 13 Example 1: Direct File Access 1/2 Goal: Reading ASM files with OS tools, using metadata information from X$ tables 1. Example: find the 2 mirrored extents of the RDBMS spfile 2. sys@+ASM1> select GROUP_KFFXP Group#, DISK_KFFXP Disk#, AU_KFFXP AU#, XNUM_KFFXP Extent# from X$KFFXP where number_kffxp=(select file_number from v$asm_alias where name='spfiletest1.ora'); GROUP# DISK# AU# EXTENT# ---------- ---------- 1 16 17528 0 1 4 14838 0 Slide 14 Example 1: Direct File Access 2/2 3. Find the disk path sys@+ASM1> select disk_number,path from v$asm_disk where GROUP_NUMBER=1 and disk_number in (16,4); DISK_NUMBER PATH ----------- ------------------------------------ 4 /dev/mpath/itstor417_1p1 16 /dev/mpath/itstor419_6p1 4. Read data from disk using dd dd if=/dev/mpath/itstor419_6p1 bs=1024k count=1 skip=17528 |strings Slide 15 X$KFFXP Column NameDescription NUMBER_KFFXPASM file number. Join with v$asm_file and v$asm_alias COMPOUND_KFFXPFile identifier. Join with compound_index in v$asm_file INCARN_KFFXPFile incarnation id. Join with incarnation in v$asm_file XNUM_KFFXPASM file extent number (mirrored extent pairs have the same extent value) PXN_KFFXPProgressive file extent number GROUP_KFFXPASM disk group number. Join with v$asm_disk and v$asm_diskgroup DISK_KFFXPASM disk number. Join with v$asm_disk AU_KFFXPRelative position of the allocation unit from the beginning of the disk. LXN_KFFXP0->primary extent,1->mirror extent, 2->2nd mirror copy (high redundancy and metadata) Slide 16 Example 2: A Different Way A different metadata table to reach the same goal of reading ASM files directly from OS: sys@+ASM1> select GROUP_KFDAT Group#,NUMBER_KFDAT Disk#, AUNUM_KFDAT AU# from X$KFDAT where fnum_kfdat=(select file_number from v$asm_alias where name='spfiletest1.ora'); GROUP# DISK# AU# ---------- ---------- ---------- 1 4 14838 1 16 17528 Slide 17 X$KFDAT Column Name (subset)Description GROUP_KFDATDiskgroup number, join with v$asm_diskgroup NUMBER_KFDATDisk number, join with v$asm_disk COMPOUND_KFDATDisk compund_index, join with v$asm_disk AUNUM_KFDATDisk allocation unit (relative position from the beginning of the disk), join with x$kffxp.au_kffxp V_KFDATFlag: V=this Allocation Unit is used; F=AU is free FNUM_KFDATFile number, join with v$asm_file XNUM_KFDATProgressive file extent number join with x$kffxp.pxn_kffxp Slide 18 Example 3: Yet Another Way Using the internal package dbms_diskgroup declare fileType varchar2(50); fileName varchar2(50); fileSz number; blkSz number; hdl number; plkSz number; data_buf raw(4096); begin fileName := '+TEST1_DATADG1/TEST1/spfiletest1.ora'; dbms_diskgroup.getfileattr(fileName,fileType,fileSz, blkSz); dbms_diskgroup.open(fileName,'r',fileType,blkSz, hdl,plkSz, fileSz); dbms_diskgroup.read(hdl,1,blkSz,data_buf); dbms_output.put_line(data_buf); end; / Slide 19 DBMS_DISKGROUP Can be used to read/write ASM files directly Its an Oracle internal package Does not require a RDBMS instance 11gs asmcmd cp command uses dbms_diskgroup Procedure NameParameters dbms_diskgroup.open (:fileName, :openMode, :fileType, :blkSz, :hdl,:plkSz, :fileSz) dbms_diskgroup.read (:hdl, :offset, :blkSz, :data_buf) dbms_diskgroup.createfile (:fileName, :fileType, :blkSz, :fileSz, :hdl, :plkSz, :fileGenName) dbms_diskgroup.close (:hdl) dbms_diskgroup.commitfile (:handle) dbms_diskgroup.resizefile (:handle,:fsz) dbms_diskgroup.remap (:gnum, :fnum, :virt_extent_num) dbms_diskgroup.getfileattr (:fileName, :fileType, :fileSz, :blkSz) Slide 20 File Transfer Between OS and ASM The supported tools (10g) RMAN DBMS_FILE_TRANSFER FTP (XDB) WebDAV (XDB) They all require a RDBMS instance In 11g, all the above plus asmcmd cp command Works directly with the ASM instance Slide 21 Strace and ASM 1/3 Goal: understand strace output when using ASM storage Example: read64(15,"#33\0@\"..., 8192, 473128960)=8192 This is a read operation of 8KB from FD 15 at offset 473128960 What is the segment name, type, file# and block# ? Slide 22 Strace and ASM 2/3 1. From /proc/ /fd I find that FD=15 is /dev/mpath/itstor420_1p1 2. This is disk 20 of D.G.=1 (from v$asm_disk) 3. From x$kffxp I find the ASM file# and extent#: Note: offset 473128960 = 451 MB + 27 *8KB sys@+ASM1>select number_kffxp, xnum_kffxp from x$kffxp where group_kffxp=1 and disk_kffxp=20 and au_kffxp=451; NUMBER_KFFXP XNUM_KFFXP ------------ ---------- 268 17 Slide 23 Strace and ASM 3/3 4. From v$asm_alias I find the file alias for file 268: USERS.268.612033477 5. From v$datafile view I find the RDBMS file#: 9 6. From dba extents finally find the owner and segment name relative to the original IO operation: sys@TEST1>select owner,segment_name,segment_type from dba_extents where FILE_ID=9 and 27+17*1024*1024 between block_id and block_id+blocks; OWNER SEGMENT_NAME SEGMENT_TYPE ----- ------------ ------------ SCOTT EMP TABLE Slide 24 Investigation of Fine Striping An application: finding the layout of fine-striped files Explored using strace of an oracle session executing alter system dump logfile.. Result: round robin distribution over 8 x 1MB extents Fine striping size = 128KB (1MB/8) A0 B0 A1 B1 A2 B2 A3 B3 A4 B4 A5 B5 A6 B6 A7 B7 AU = 1MB Slide 25 Metadata Files ASM diskgroups contain hidden files Not listed in V$ASM_FILE (file# Read Performance, Sequential I/O Limited by HBAs -> 4 x 2 Gb (measured with parallel query) Slide 39 Implementation Details Multipathing Linux Device Mapper (2.6 kernel) Block devices RHEL4 and 10gR2 allow to skip raw devices mapping External half of the disk for data disk groups JBOD config No HW RAID ASM used to mirror across disk arrays HW: Storage arrays (Infortrend): FC controller, SATA disks FC (Qlogic): 4Gb switch and HBAs (2Gb in older HW) Servers are 2x CPUs, 4GB RAM, 10.2.0.3 on RHEL4, RAC of 4 to 8 nodes Slide 40 Q&A Links: http://cern.ch/phydb http://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals http://www.cern.ch/canali