After the initial setup of an Audit Vault and Database Firewall engineering system, I started adding several Audit Vault agents and secured targets. In the beginning it went quite smoothly. But after a certain number of secured targets, ORA-04031 errors kept occurring. Most of them were related to the large pool and PX Msg buffers. The analysis of the trace files turned up some interesting stuff. 😉 But more on that in a later blog post. The real problem is the available memory.
Symptoms
The Audit Vault and Database Firewall engineering system is running on an HP ProLiant BL465c Gen 8 with 32GB of memory. That should actually be sufficient for an engineering system. It turned out, however, that the operating system does not recognize the full 32GB. As you can see below, the system has just 3GB of memory in total.
[root@melete2 ~]# free
             total       used       free     shared    buffers     cached
Mem:       3048108    2385888     662220          0      10720    1525036
-/+ buffers/cache:     850132    2197976
Swap:      4194296     453564    3740732
Reviewing the dmesg output shows that we lose 29 GB of memory.
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-300.39.5.el5uek (mockbuild@ca-build56.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Wed Mar 13 11:26:53 PDT 2013
Command line: ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bddde000 (usable)
 BIOS-e820: 00000000bddde000 - 00000000bde0e000 (ACPI data)
 BIOS-e820: 00000000bde0e000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000083efff000 (usable)
DMI 2.7 present.
last_pfn = 0x83efff max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-FFFFF write-back
MTRR variable ranges enabled:
  0 base 000000000000 mask FFFF80000000 write-back
  1 base 000080000000 mask FFFFC0000000 write-back
  2 disabled
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000c0000000 - 000000083efff000 (usable) ==> (reserved)
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 29679MB of RAM.
------------[ cut here ]------------
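To cross-check the warning, the usable ranges of the BIOS-e820 map can be added up and compared with what free reports. The following is only a minimal sketch and not part of the original analysis: the function name e820_usable_mib is made up for illustration, and it assumes the old-style "BIOS-e820: <start> - <end> (usable)" line format seen in the dmesg output above. Newer kernels print the map differently, so the pattern would need adjusting there.

```shell
# Sum all "usable" BIOS-e820 ranges read from stdin and print the total in MiB.
# Hypothetical helper; assumes lines like:
#   BIOS-e820: 0000000000100000 - 00000000bddde000 (usable)
e820_usable_mib() {
    # Keep only the usable ranges and reduce each line to "<start> <end>".
    sed -n 's/.*BIOS-e820: *\([0-9a-fA-F]*\) - \([0-9a-fA-F]*\) (usable).*/\1 \2/p' | {
        total=0
        while read -r start end; do
            # Addresses are hexadecimal; the range size is end minus start.
            total=$(( total + 0x$end - 0x$start ))
        done
        echo $(( total / 1024 / 1024 ))
    }
}
```

It could be called as `dmesg | e820_usable_mib`. For the map shown above the usable ranges add up to roughly 32GB, which is what the BIOS reports to the kernel, while free only sees about 3GB.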
Cause
According to Oracle Metalink Note 1448147.1, this problem is caused by a BIOS issue.
Solutions and Workaround
The solution described in Oracle Metalink Note 1448147.1 is to upgrade the BIOS or to disable MTRR trimming in the kernel. Since a BIOS upgrade is not an option for this environment, I’ll work around the problem by disabling MTRR trimming.
Disable MTRR trimming
Changing grub.conf is basically quite easy once you find the boot files. When I first tried it, I realized that there was no grub configuration available. It seems that Oracle decided not to mount /boot at startup. So you first have to mount the boot partition. Afterwards you can simply add disable_mtrr_trim as an additional kernel option.
[root@melete2 ~]# mount /boot
[root@melete2 ~]# df -kh /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             145M   26M  112M  19% /boot
[root@melete2 ~]# vi /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/vg_root/lv_root
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.39.5.el5uek ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.39.5.el5uek.img
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.38.1.el5uek ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.38.1.el5uek.img
[root@melete2 ~]# reboot

Broadcast message from root (pts/0) (Thu Jul 11 20:17:56 2013):

The system is going down for reboot NOW!
[root@melete2 ~]# Connection to melete2 closed by remote host.
Connection to melete2 closed.
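Once the system is back up, it may be worth confirming that the kernel was really booted with the new option before looking at the memory. A small sketch of such a check follows; the helper name has_kernel_option is hypothetical, and it relies only on /proc/cmdline, which holds the command line of the running kernel.

```shell
# Return success if a kernel option is present on the kernel command line.
# $1 = option name, $2 = file to inspect (defaults to /proc/cmdline).
# Hypothetical helper for illustration.
has_kernel_option() {
    # -F: treat the option as a fixed string, -w: match whole words only
    grep -qwF -- "$1" "${2:-/proc/cmdline}"
}
```

For example, `has_kernel_option disable_mtrr_trim && echo "MTRR trimming is disabled"` prints the message only if the option made it into the booted kernel's command line.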
After the reboot, the full 32GB of memory is available.
[root@melete2 ~]# free
             total       used       free     shared    buffers     cached
Mem:      33024372    3930724   29093648          0      17868    2640744
-/+ buffers/cache:    1272112   31752260
Swap:     14680056          0   14680056
Unfortunately, the configuration of the AVDF appliance is not automatically updated to use the extra memory. We have to make some manual changes.
Update Kernel Parameters
The kernel settings have to be changed to allow a bigger SGA. See Metalink Note 1529433.1 for more detailed information on how to calculate and set the kernel parameters. For the engineering system we will define an SGA of 20GB, therefore we set shmmax and shmall as follows:
[root@melete2 ~]# vi /etc/sysctl.conf
…
kernel.shmmax=23622320128
kernel.shmall=5368709120
...
[root@melete2 ~]# sysctl -p
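A note on the units, since they are easy to mix up: kernel.shmmax is given in bytes, and 23622320128 is exactly 22 GiB, which leaves some headroom above the 20GB SGA. kernel.shmall, on the other hand, is counted in memory pages rather than bytes, so the value used above permits far more than 22 GiB. That is a looser limit, not a broken one. The sketch below shows the arithmetic for a tighter pair of values; the function name shm_settings is hypothetical, and the Metalink note remains the reference for actual sizing.

```shell
# Print sysctl lines for a shared memory limit of $1 GiB.
# $2 = page size in bytes (defaults to the running system's page size).
# Hypothetical helper for illustration.
shm_settings() {
    gib=$1
    page_size=${2:-$(getconf PAGE_SIZE)}
    bytes=$(( gib * 1024 * 1024 * 1024 ))
    echo "kernel.shmmax=$bytes"                    # largest single segment, in bytes
    echo "kernel.shmall=$(( bytes / page_size ))"  # system-wide total, in pages
}
```

With a 4096-byte page size, `shm_settings 22 4096` prints kernel.shmmax=23622320128 and kernel.shmall=5767168.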
Increase SWAP
With 32GB of memory, it is also advisable to enlarge the swap space. I’ve already discussed this in the blog post Resize swap space on linux. Since the AVDF appliance uses logical volumes, it’s even a bit easier.
[root@melete2 ~]# swapoff -v /dev/vg_root/lv_swap
[root@melete2 ~]# lvresize /dev/vg_root/lv_swap -L +8G
[root@melete2 ~]# mkswap /dev/vg_root/lv_swap
[root@melete2 ~]# swapon -v /dev/vg_root/lv_swap
Increase SGA
Finally we can increase the SGA.
SQL> alter system set sga_max_size=20G scope=spfile;

System altered.

SQL> alter system set sga_target=20G scope=spfile;

System altered.

SQL> startup force
Conclusion
Although AVDF is an appliance, it is essential to examine the system after installation. For example: are there errors in the log files in /var/log, and are memory, storage etc. fully available? The solution described here makes it possible to use all the memory. Nevertheless, the appliance has been modified to an extent where it is necessary to consider whether it is still covered by support. If you run into a similar issue on your production AVDF setup, I would recommend opening an Oracle SR. Looking forward to the next AVDF patchset. I hope this system stays patchable.
References
Some links related to this post.
- Linux kernel could not recognize whole RAM [1448147.1]
- Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28: No space left on device [301830.1]
- Requirements for Installing Oracle Database 12.1 on RHEL5 or OL5 64-bit (x86-64) [1529433.1]
- Requirements for Installing Oracle 11gR2 RDBMS on RHEL (and OEL) 5 on AMD64/EM64T [880989.1]
- Master Note of Linux OS Requirements for Database Server [851598.1]
This note was helpful especially with the link to the metalink note. Appreciate you sharing this information.
Thanks for the feedback. I’m glad the post was useful.
Hello Stefan, this is a good post. We are installing AVDF but it is failing silently. Would you know where the default log files for errors would be created? /var/log?
Hi
I’m not sure, but I guess it would be on the AV server in av/log, or in av/log on the client. At the moment I do not have access to a new installation. I’ll try to look into it as soon as I do my next agent installation.
Cheers
Stefan