Tag Archives: Oracle Enterprise Linux

AVDF missing boot partition

While working on the problem with missing RAM on the AVDF test system (see AVDF Linux kernel could not recognize whole RAM), I realized that the Linux boot partition is not mounted by default.

[root@melete2 log]# ls -al /boot
total 16
drwxr-xr-x  2 root root 4096 Jan 11  2013 .
drwxr-xr-x 24 root root 4096 Jul 11 20:19 ..

[root@melete2 log]# df -kh /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_root-lv_root
                      6.6G  2.2G  4.1G  35% /

Initially I was a bit confused, since /boot normally contains the grub configuration, the initrd image, the kernel and so on, everything that is needed to boot the system. What I had not thought about is that the bootloader does not need the file system to be mounted. From a security point of view it is even better not to have it mounted: if it is not mounted, nobody can accidentally change something. 😉 Oracle has defined noauto for the boot partition, so the device is not mounted automatically during system boot.

[root@melete2 log]# cat /etc/fstab|grep boot
LABEL=/boot                    /boot                    ext3   noatime,noauto,nodev,nosuid                  1 2
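A quick way to check for this on other systems is to look at the options field of the fstab entry. The following is a minimal sketch; the function name has_noauto is made up for this example and is not part of AVDF.

```shell
# Print "yes" if the fstab entry for a mount point carries the noauto option.
# Arguments: $1 = fstab file, $2 = mount point. The function name is illustrative.
has_noauto() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "noauto") found = 1
        }
        END { print (found ? "yes" : "no") }
    ' "$1"
}

# Example: has_noauto /etc/fstab /boot
```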

If you need to change the grub configuration just mount the boot partition manually.

[root@melete2 log]# mount /boot

[root@melete2 log]# vi /boot/grub/grub.conf 

[root@melete2 ~]# umount /boot
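To make sure the partition is unmounted again after the change, the three steps can be wrapped in a small helper. This is only a sketch: the name with_boot is hypothetical, it has to run as root, and it assumes /boot is not already mounted.

```shell
# Mount /boot, run the given command, then unmount /boot again.
# Returns the exit status of the command (or 1 if the mount fails).
with_boot() {
    mount /boot || return 1
    "$@"
    rc=$?
    umount /boot
    return $rc
}

# Example: with_boot vi /boot/grub/grub.conf
```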

AVDF Linux kernel could not recognize whole RAM

After the initial setup of an Audit Vault and Database Firewall engineering system, I started to add several audit vault agents and secured targets. In the beginning it went quite smoothly. But after a certain number of secured targets, ORA-04031 errors occurred continuously. Most of the errors were related to the large pool and PX msg buffers. The analysis of the trace files showed some interesting stuff. 😉 But more on that in a later blog post. The real problem is the available memory.

Symptoms

The Audit Vault and Database Firewall engineering system is running on an HP ProLiant BL465c Gen 8 with 32GB of memory. That should actually be sufficient for an engineering system. It turned out, however, that the 32GB are not recognized by the operating system. As you can see below, the system has just 3GB of memory in total.

[root@melete2 ~]# free
                     total    used   free shared buffers  cached
Mem:               3048108 2385888 662220      0   10720 1525036
-/+ buffers/cache:  850132 2197976
Swap:              4194296  453564 3740732

Reviewing dmesg shows that we lose 29 GB of memory.

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-300.39.5.el5uek (mockbuild@ca-build56.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Wed Mar 13 11:26:53 PDT 2013
Command line: ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bddde000 (usable)
 BIOS-e820: 00000000bddde000 - 00000000bde0e000 (ACPI data)
 BIOS-e820: 00000000bde0e000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000083efff000 (usable)
DMI 2.7 present.
last_pfn = 0x83efff max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-FFFFF write-back
MTRR variable ranges enabled:
  0 base 000000000000 mask FFFF80000000 write-back
  1 base 000080000000 mask FFFFC0000000 write-back
  2 disabled
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000c0000000 - 000000083efff000 (usable) ==> (reserved)
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 29679MB of RAM.
------------[ cut here ]------------
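The numbers in this dmesg output are consistent with each other. The two enabled MTRR variable ranges cover only the first 3GB of the address space (2GB plus 1GB), and the usable e820 region above 4GB is exactly the 29679MB the kernel reports as lost. The arithmetic can be sketched in shell as follows; the function names are made up for this example, and the 48-bit mask width is an assumption inferred from the 12 hex digits of the masks.

```shell
# Size in MB of one MTRR variable range, derived from its mask.
# Assumes a 48-bit physical address mask, as the 12 hex digits in dmesg suggest.
mtrr_size_mb() {
    echo $(( ((~0x$1 & 0xFFFFFFFFFFFF) + 1) / 1048576 ))
}

# Size in MB of an address range given as hex start and end (end exclusive).
range_mb() {
    echo $(( (0x$2 - 0x$1) / 1048576 ))
}

mtrr_size_mb FFFF80000000                      # range 0: 2048
mtrr_size_mb FFFFC0000000                      # range 1: 1024
range_mb 0000000100000000 000000083efff000     # usable RAM above 4GB: 29679
```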

Cause

According to Oracle Metalink Note 1448147.1, this problem is related to a BIOS issue.

Solutions and Workaround

The solution described in Oracle Metalink Note 1448147.1 is to upgrade the BIOS or to disable the MTRR trimming in the kernel. Since a BIOS upgrade is not an option for this environment, I'll work around the problem by disabling the MTRR trimming.

Disable MTRR

Changing grub.conf is basically quite easy, if you can find the boot files. When I first tried it, I realized that there was no grub configuration available. It seems that Oracle decided not to mount /boot at startup. So it is mandatory to mount the boot partition first. Afterwards you can just add disable_mtrr_trim as an additional kernel option.

[root@melete2 ~]# mount /boot

[root@melete2 ~]# df -kh /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             145M   26M  112M  19% /boot

[root@melete2 ~]# vi /boot/grub/grub.conf 
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/vg_root/lv_root
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.39.5.el5uek ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.39.5.el5uek.img
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.38.1.el5uek ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.38.1.el5uek.img
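Instead of editing the file by hand, the option can also be appended to every kernel line with a small script. The following is a sketch under a few assumptions: the function name add_kernel_opt is invented for this example, it targets the legacy grub.conf format shown above, and it only writes to stdout so the result can be reviewed before it replaces the original file.

```shell
# Append a kernel option to every "kernel" line of a legacy GRUB config that
# does not already contain it. Prints the result to stdout; the input file is
# left untouched. Function name and arguments are illustrative.
add_kernel_opt() {
    awk -v opt="$2" '
        /^[ \t]*kernel[ \t]/ && index($0, opt) == 0 {
            sub(/[ \t]+$/, "")      # drop trailing whitespace
            $0 = $0 " " opt
        }
        { print }
    ' "$1"
}

# Example (as root, with /boot mounted; review the diff before replacing the file):
#   add_kernel_opt /boot/grub/grub.conf disable_mtrr_trim > /tmp/grub.conf.new
#   diff /boot/grub/grub.conf /tmp/grub.conf.new
```

Commented lines are not touched, and running the function twice does not add the option a second time.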

[root@melete2 ~]# reboot

Broadcast message from root (pts/0) (Thu Jul 11 20:17:56 2013):

The system is going down for reboot NOW!
[root@melete2 ~]# Connection to melete2 closed by remote host.
Connection to melete2 closed.

After reboot we now have 32GB memory available.

[root@melete2 ~]# free
                      total     used     free shared buffers  cached
Mem:               33024372  3930724 29093648      0   17868 2640744
-/+ buffers/cache:  1272112 31752260
Swap:              14680056        0 14680056

Unfortunately, the configuration of the AVDF appliance is not automatically updated to use the extra memory. We have to do some manual changes.

Update Kernel Parameters

The kernel settings have to be changed to allow a bigger SGA. See Metalink Note 1529433.1 for more detailed information on how to calculate and set the kernel parameters. For the engineering system we will define an SGA of 20GB, therefore we set shmmax and shmall as follows:

[root@melete2 ~]# vi /etc/sysctl.conf
…
kernel.shmmax=23622320128
kernel.shmall=5368709120
...
[root@melete2 ~]# sysctl -p
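Note that kernel.shmmax is given in bytes, while kernel.shmall is counted in memory pages. The shmall value above is therefore far more generous than a 20GB SGA requires; it simply acts as a high upper limit. If a tighter value is preferred, it can be derived from shmmax and the page size. A minimal sketch, with a function name chosen for this example and a page size of 4096 bytes assumed in the sample call:

```shell
# kernel.shmall is counted in pages, kernel.shmmax in bytes.
# Print the smallest shmall (in pages) that covers a given shmmax, rounding up.
# $1 = shmmax in bytes, $2 = page size in bytes (defaults to getconf PAGE_SIZE).
shmall_pages() {
    shmmax_bytes=$1
    page_size=${2:-$(getconf PAGE_SIZE)}
    echo $(( (shmmax_bytes + page_size - 1) / page_size ))
}

shmall_pages 23622320128 4096    # 5767168
```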

Increase SWAP

With 32GB of memory, it is also advisable to enlarge the swap space. I've discussed this already in the blog post Resize swap space on linux. Since the AVDF appliance uses logical volumes, it is even a bit easier.

[root@melete2 ~]# swapoff -v /dev/vg_root/lv_swap

[root@melete2 ~]# lvresize /dev/vg_root/lv_swap -L +8G

[root@melete2 ~]# mkswap /dev/vg_root/lv_swap

[root@melete2 ~]# swapon -v /dev/vg_root/lv_swap

Increase SGA

Finally we can increase the SGA.


SQL> alter system set sga_max_size=20G scope=spfile;
System altered.

SQL> alter system set sga_target=20G scope=spfile;
System altered.

SQL> startup force

Conclusion

Although AVDF is an appliance, it is mandatory to examine the system after installation. E.g. are there errors in the log files in /var/log, are memory, storage etc. available as expected? The solution described here makes it possible to use all the memory. Nevertheless, the appliance has been adjusted to an extent where it is necessary to consider whether it is still supported. If you run into a similar issue on your production AVDF setup, I would recommend opening an Oracle SR. Looking forward to the next AVDF patchset. I hope this system stays patchable.

References

Some links related to this post.

  • Linux kernel could not recognize whole RAM [1448147.1]
  • Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28: No space left on device [301830.1]
  • Requirements for Installing Oracle Database 12.1 on RHEL5 or OL5 64-bit (x86-64) [1529433.1]
  • Requirements for Installing Oracle 11gR2 RDBMS on RHEL (and OEL) 5 on AMD64/EM64T [880989.1]
  • Master Note of Linux OS Requirements for Database Server [851598.1]

Resize swap space on linux

A few times a year I create a new Linux VM. I usually do this by using a kickstart server. The kickstart configuration file I normally use creates a swap partition which is too small for an Oracle database server. Unfortunately, I regularly forget how to resize the swap partition. Ok, I could update my kickstart configuration file before I create the VM, but this gets forgotten as well 😉

Background

I try to limit the size of my VMs as much as possible. Disk space on an SSD is not yet as cheap as it should be. Therefore I usually create VM disks which can grow up to a certain limit. For the swap disk I use a 4GB VM disk and define a swap space of about 2GB. The VM disk itself will not grow as long as there is not a lot of swapping. But if the VM has at least 2GB of memory, the Oracle installer complains about too little swap space. Ok, you can ignore this 😉 or you can increase the swap space.

Let’s do it

Check the current settings

cat /etc/fstab 
LABEL=/                 /                       ext3    defaults        1 1
LABEL=/u00              /u00                    ext3    defaults        1 2
LABEL=/u01              /u01                    ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
LABEL=SWAP-sdb1         swap                    swap    defaults        0 0

Switch off the swap device

swapoff -a

Recreate the swap partition with fdisk

fdisk /dev/sdb


Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Delete the old swap partition

Command (m for help): d
Selected partition 1

Command (m for help): d
No partition is defined yet!

Create a new partition. I’ll use the full size of the disk /dev/sdb

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-522, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-522, default 522): 
Using default value 522

Select the partition type. This has to be done after the new partition has been created.

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): l

 0  Empty           1e  Hidden W95 FAT1 80  Old Minix       bf  Solaris        
 1  FAT12           24  NEC DOS         81  Minix / old Lin c1  DRDOS/sec (FAT-
 2  XENIX root      39  Plan 9          82  Linux swap / So c4  DRDOS/sec (FAT-
 3  XENIX usr       3c  PartitionMagic  83  Linux           c6  DRDOS/sec (FAT-
 4  FAT16 <32M      40  Venix 80286     84  OS/2 hidden C:  c7  Syrinx         
 5  Extended        41  PPC PReP Boot   85  Linux extended  da  Non-FS data    
 6  FAT16           42  SFS             86  NTFS volume set db  CP/M / CTOS / .
 7  HPFS/NTFS       4d  QNX4.x          87  NTFS volume set de  Dell Utility   
 8  AIX             4e  QNX4.x 2nd part 88  Linux plaintext df  BootIt         
 9  AIX bootable    4f  QNX4.x 3rd part 8e  Linux LVM       e1  DOS access     
 a  OS/2 Boot Manag 50  OnTrack DM      93  Amoeba          e3  DOS R/O        
 b  W95 FAT32       51  OnTrack DM6 Aux 94  Amoeba BBT      e4  SpeedStor      
 c  W95 FAT32 (LBA) 52  CP/M            9f  BSD/OS          eb  BeOS fs        
 e  W95 FAT16 (LBA) 53  OnTrack DM6 Aux a0  IBM Thinkpad hi ee  EFI GPT        
 f  W95 Ext'd (LBA) 54  OnTrackDM6      a5  FreeBSD         ef  EFI (FAT-12/16/
10  OPUS            55  EZ-Drive        a6  OpenBSD         f0  Linux/PA-RISC b
11  Hidden FAT12    56  Golden Bow      a7  NeXTSTEP        f1  SpeedStor      
12  Compaq diagnost 5c  Priam Edisk     a8  Darwin UFS      f4  SpeedStor      
14  Hidden FAT16 3 61  SpeedStor       a9  NetBSD          f2  DOS secondary  
16  Hidden FAT16    63  GNU HURD or Sys ab  Darwin boot     fb  VMware VMFS    
17  Hidden HPFS/NTF 64  Novell Netware  b7  BSDI fs         fc  VMware VMKCORE 
18  AST SmartSleep  65  Novell Netware  b8  BSDI swap       fd  Linux raid auto
1b  Hidden W95 FAT3 70  DiskSecure Mult bb  Boot Wizard hid fe  LANstep        
1c  Hidden W95 FAT3 75  PC/IX           be  Solaris boot    ff  BBT            
Hex code (type L to list codes): 82
Changed system type of partition 1 to 82 (Linux swap / Solaris)

Write the changes to disk and exit

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Now it’s time to create a new swap area with mkswap. Because I use labels in fstab, I create the new swap area with the same label again.

mkswap /dev/sdb1 -L SWAP-sdb1
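Since swapon -a relies on the fstab entry, the label passed to mkswap has to match the one referenced there. A small sketch to list the swap labels that fstab expects; the function name swap_labels is made up for this example.

```shell
# Print the labels of all swap entries in an fstab file that are referenced
# by LABEL=. Argument: $1 = fstab file. The function name is illustrative.
swap_labels() {
    awk '$1 !~ /^#/ && $3 == "swap" && $1 ~ /^LABEL=/ {
        sub(/^LABEL=/, "", $1)
        print $1
    }' "$1"
}

# Example: swap_labels /etc/fstab
```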

Enable the swap device again

swapon -a

Display the new swap info

swapon -s
Filename                                Type            Size    Used    Priority
/dev/sdb1                               partition       4192924 34324   -1
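The Size column is reported in 1K blocks. To get the total over all swap devices in a more readable unit, the output can be piped through a short awk filter. This is just a sketch, and the function name swap_total_mib is invented for this example.

```shell
# Sum the Size column (1K blocks) of "swapon -s" style output read from stdin
# and print the total in MiB, rounded down. The header line is skipped.
swap_total_mib() {
    awk 'NR > 1 { kb += $3 } END { print int(kb / 1024) }'
}

# Example: swapon -s | swap_total_mib
```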