PROBLEM:
Physical Memory Usage for an SNMP device is reported incorrectly or is greater than 100%
SOLUTION:
As many of our users have noticed, the "Physical Memory Usage" test on a Linux device (monitored using the NET-SNMP agent) often returns a high value such as 98%. This happens because the used memory reported by the Linux kernel includes I/O cache and buffers. Cached memory is released as soon as an application needs it. While this improves I/O performance (ever wonder why the second 'find' command is much faster than the first one? :-), it can create confusion and false alarms in Traverse.
Depending on the appliance vendor, proprietary metrics may be available that report the real memory usage as a single metric. For example, Check Point appliances exhibit this issue when the RFC-level (.1.3.6.1.2.1) memory usage is monitored. To get around this, Check Point offers the metric memFreeReal64 (1.3.6.1.4.1.2620.1.6.7.4.5). This metric is included in our Check Point firewall signature.
Fortunately, if no proprietary metric is available, you can use the Composite Monitor to work around this issue. Essentially, real memory usage is calculated by subtracting buffer and cache memory from used memory.
That is:
real memory usage = (used - cache - buffer) / total
Here are the steps to follow to create a new test that reflects the real memory utilization:
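The formula above can be illustrated with a short Python sketch. The function name and the sample values are hypothetical and chosen only to show how a raw reading of 98% can correspond to a much lower real usage; they are not taken from a real device or from Traverse itself.

```python
def real_memory_usage_pct(total, used, buffers, cached):
    """Return memory utilization in percent, excluding buffers and cache.

    All arguments must be expressed in the same unit (for example KB).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (used - buffers - cached) / total


# Hypothetical sample values in KB (assumptions for illustration only).
total_kb = 8_000_000
used_kb = 7_840_000      # includes buffers and cache, as the raw test sees it
buffers_kb = 400_000
cached_kb = 5_040_000

raw_pct = 100.0 * used_kb / total_kb
real_pct = real_memory_usage_pct(total_kb, used_kb, buffers_kb, cached_kb)

print(f"raw: {raw_pct:.1f}%  real: {real_pct:.1f}%")
```

With these sample numbers the raw figure is 98% while the usage that applications actually hold is 30%.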
Step 1: This article applies to Traverse 9.5.022 and higher. For earlier releases, please consider upgrading before attempting the steps in this article.
Step 2: Discover and Monitor Cache/Buffer Memory Usage Tests
- Represent into the department level
- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Click on "Create New Standard Tests"
- Select the radio button for 'Create new tests by selecting specific monitors'
- Select the SNMP monitor, click 'Add Tests', then click 'Continue' on the next page
- On the discovered test list, check only the following tests, then click 'Provision Selected Tests':
"Buffer Memory Usage"
"Cache Memory Usage"
Step 3: Rename Existing 'Physical Memory Usage' Test and Update Thresholds
- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Locate the "Physical Memory Usage" test and click on the modify icon
- Change the test name to "Total Memory Usage"
- Change the warning and critical thresholds to 100
- Take note of the value in the Maximum Value field (e.g. copy it with CTRL-C)
- Click on "Submit"
Step 4: Verify Correct Maximum Values for Buffer and Cache Memory Usage Tests
- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Locate the "Buffer Memory Usage" test and click on the modify icon
- Make sure the Maximum Value matches the Maximum Value noted in Step 3 from the original "Physical Memory Usage" test (now "Total Memory Usage")
- Update the Maximum Value if needed and click on "Submit"
- Repeat the process for "Cache Memory Usage"
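The reason for aligning the Maximum Values can be sketched in Python: percentages can only be subtracted from one another when they share the same denominator. The helper `usage_pct` and all numbers below are hypothetical, and the sketch assumes each test reports its value as a percentage of its Maximum Value.

```python
def usage_pct(value, maximum):
    """Percentage of `maximum` that `value` represents."""
    return 100.0 * value / maximum


# Hypothetical values in KB (assumptions for illustration only).
physical_max = 8_000_000   # Maximum Value noted from the physical memory test
used, buffers, cached = 7_840_000, 400_000, 5_040_000

# All three tests share the same maximum: the subtraction is meaningful.
consistent = (usage_pct(used, physical_max)
              - usage_pct(buffers, physical_max)
              - usage_pct(cached, physical_max))

# Buffer test left with a different maximum: the result is skewed.
wrong_buffer_max = 4_000_000
skewed = (usage_pct(used, physical_max)
          - usage_pct(buffers, wrong_buffer_max)
          - usage_pct(cached, physical_max))

print(f"consistent: {consistent:.1f}%  skewed: {skewed:.1f}%")
```

In this example the mismatched maximum shifts the result from 30% to 25%, which is why the Maximum Value of all three tests should be identical before they are combined.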
Step 5: Calculate Real Memory Utilization
- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Click on "Create New Advanced Tests" (available only after representing into department per Step 2)
- Enable "Composite Test" and set the following parameters:
Test Name: Physical Memory Usage
Warning Threshold: 85
Critical Threshold: 95
- Click on "Add" (Child Tests)
- From the pop-up, select "Buffer Memory Usage", "Cache Memory Usage" and "Total Memory Usage"
- Click on "Add Tests" (pop-up closes)
- In the "Expression" field, enter the corresponding variables so that the expression evaluates to:
Total Memory Usage - Buffer Memory Usage - Cache Memory Usage
E.g. Given:
"Total Memory Usage" is T1
"Buffer Memory Usage" is T2
"Cache Memory Usage" is T3
The expression should be "T1 - T2 - T3"
- Click on "Provision Tests"
- Modify the newly created test and update the 'Units' field from 'custom' to '%'
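To show the effect of the composite expression together with the thresholds configured in this step, here is a minimal Python sketch. The child test readings are hypothetical, and the `severity` function is an assumption about how thresholds are compared (value at or above the threshold); it is not a description of Traverse's internal logic.

```python
WARNING_THRESHOLD = 85.0
CRITICAL_THRESHOLD = 95.0


def severity(value, warning=WARNING_THRESHOLD, critical=CRITICAL_THRESHOLD):
    """Classify a percentage against the thresholds (assumed >= comparison)."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


# Hypothetical child test readings, in percent of the shared Maximum Value.
child = {
    "T1": 98.0,  # Total Memory Usage
    "T2": 5.0,   # Buffer Memory Usage
    "T3": 63.0,  # Cache Memory Usage
}

# Same arithmetic as the composite expression "T1 - T2 - T3".
composite = child["T1"] - child["T2"] - child["T3"]

print("raw test:      ", child["T1"], severity(child["T1"]))
print("composite test:", composite, severity(composite))
```

Under these assumptions the raw reading of 98% would cross the critical threshold, while the composite value of 30% stays below the warning threshold.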
Now navigate to Status -> Devices and drill down into the device in question. Within a few minutes, the (composite) "Physical Memory Usage" test should show the correct memory utilization, excluding the portion used by Buffer and Cache Memory.
Further reading:
http://www.linuxatemyram.com
https://sites.google.com/a/thetnaing.com/therunningone/how-to-calculate-systems-memory-utilization