Admin: Graphs

From Resin 3.0

Latest revision as of 23:28, 14 October 2010


Graphs

The Graphs tab in /resin-admin shows the meter data collected by Resin across the cluster. The Statistics service that gathers the meter data is available in Resin Professional.

Graph Browsing

When looking at the server statistics, you first need to select meters to display. Resin's meters are the statistics data streams gathered every minute and stored by the triad servers. Some meters are JMX attributes collected over time, and others record data from Resin's embedded sensors.

Selecting a meter adds it to the graph. Once you've selected a group of meters, you can save them as a named meter set with "Save Meters".

Meter Names

Meter names follow a standard convention: "00|Author|Group|Name". The "00" is the server index in the cluster. Author is something like "JVM", "OS", or "Resin", or "MyCom" for custom meters.
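
The naming convention above can be split mechanically. The following is a minimal sketch, not part of Resin: the class MeterName and its fields are hypothetical names chosen for illustration, and it assumes that any "|" after the third separator belongs to the name part (as in the per-target cluster meters).

```java
import java.util.regex.Pattern;

/** Illustrative parser for meter names of the form "00|Author|Group|Name". */
final class MeterName {
    final int serverIndex;
    final String author;
    final String group;
    final String name;

    private MeterName(int serverIndex, String author, String group, String name) {
        this.serverIndex = serverIndex;
        this.author = author;
        this.group = group;
        this.name = name;
    }

    static MeterName parse(String fullName) {
        // A limit of 4 keeps any further '|' characters inside the name part.
        String[] parts = fullName.split(Pattern.quote("|"), 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + fullName);
        }
        return new MeterName(Integer.parseInt(parts[0]), parts[1], parts[2], parts[3]);
    }

    public static void main(String[] args) {
        MeterName m = parse("00|JVM|Thread|JVM Thread Count");
        System.out.println(m.serverIndex + " / " + m.author + " / " + m.group + " / " + m.name);
    }
}
```

Splitting with a limit rather than on every separator is a design choice for this sketch; it keeps longer names intact instead of rejecting them.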

Server Groups

Graphs for servers in the cluster can be displayed in three basic modes: single server, one graph, or multiple graphs. In the one-graph mode, each meter for each server has its own graph line. In the multiple-graph mode, each server gets its own graph. The multiple-graph mode is more useful for comparisons across the cluster.

Cookbook: setting up a thread graph

  1. Clear all the meters by clicking the "Clear Meters" button on the right.
  2. Open the "JVM|Thread" group to find the recorded data from the JVM's own thread count.
  3. Select "JVM Thread Count". The graph should now show the JVM's thread count. You can use the "Time" selector to change the timescale.
  4. Open the "Resin|Thread" group for the meters in Resin's own thread pool.
  5. Select all the meters in the "Resin|Thread" group. You should see a graph with about 4 lines visible and the rest at zero.
  6. Type "threads" in the Meter Save Name form and select "Save Meters". Saving the meters will add "threads" as a predefined meter group in the Meters selection at the top.

[Image: Graphs-threads.png]

Meters

The predefined meters are in three groups: JVM, OS, and Resin.

  • JVM is data from the JVM's JMX beans, like thread counts and garbage collection.
  • OS is data from the operating system, like CPU counts.
  • Resin is data from Resin's JMX and sensors.

JVM|Compilation

The JVM compilation group measures JIT compilation times as reported by the JVM.

Compilation Time
the time taken for JIT compilation in the last 60 seconds.

JVM|Memory

The JVM's memory and garbage collection information is useful for tuning memory, checking for memory leaks, and verifying that GC time is in a reasonable range.

GC Time|PS MarkSweep
the GC time taken in the last 60 seconds for full mark-sweep collection as reported by the JVM.
GC Time|PS Scavenge
the GC time taken for short GC scavenging as reported by the JVM.
Heap Memory Free
free heap memory in bytes as reported by the JVM
Heap Memory Used
total allocated memory in the heap
Loaded Classes
total number of classes loaded by the JVM
PermGen Memory Free
memory free in the "perm gen" pool, used for .class data
PermGen Memory Used
allocated memory in the perm gen pool.
Tenured Memory Free
free memory in the long-term tenured heap
Tenured Memory Used
allocated memory in the long-term tenured heap

JVM|Thread

The JVM's thread group reports the total number of threads in the JVM.

JVM Thread Count
The total threads in the JVM
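
For orientation, the standard java.lang.management platform MXBeans expose values comparable to the JVM meters above. This is a sketch of where such numbers can be read, not a description of how Resin collects them; the class name JvmMeterSnapshot is hypothetical. Note that the GC and compilation times returned here are cumulative since JVM start, while the meters above report the last 60 seconds.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads JVM statistics from the standard platform MXBeans. */
final class JvmMeterSnapshot {
    /** Current number of live threads in the JVM. */
    static int threadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Heap memory currently in use, in bytes. */
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Number of classes currently loaded. */
    static int loadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    /** Cumulative GC time in milliseconds, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Cumulative JIT compilation time in milliseconds, or -1 if unavailable. */
    static long totalCompilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean(); // null without a JIT
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        System.out.println("threads=" + threadCount() + " heapUsed=" + heapUsedBytes()
            + " classes=" + loadedClasses() + " gcMs=" + totalGcMillis()
            + " jitMs=" + totalCompilationMillis());
    }
}
```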

OS|CPU

The CPU load as reported by the JVM. The data reported differs between operating systems. On Linux, the load is reported for each CPU and as a combined value.

Unix Load Avg
On Unix systems (non-Linux), reports the system's load average. The load average is the count of runnable processes; it is not directly a measure of CPU load.

OS|Memory

The OS|Memory group reports operating system memory.

Physical Memory Free
The physical free memory as reported in JMX.
Swap Free
The free swap as reported in JMX.

OS|Process

Process-related information as reported by the OS.

File Descriptor Count
The number of open files and sockets in the JVM process

Resin|Cache

The cache statistics include both the proxy cache and Resin's underlying block cache, which is also used for distributed sessions.

Block Miss Count
How many times the low-level block cache missed, causing Resin to read or write from disk
Block Read Count
The count of blocks read in the last 60 seconds.
Block Write Count
How many blocks were written to disk in the last 60 seconds.
Proxy Cache Hit Count
How many requests successfully used the proxy cache in the last 60 seconds.
Proxy Cache Miss Count
How many cacheable requests failed to find a valid page in the proxy cache in the last 60 seconds.
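
One way to think about the "in the last 60 seconds" counts above is as the difference between two readings of a running total, taken one sample interval apart. The sketch below illustrates that idea only; IntervalCounter is a hypothetical class, and the source does not say that Resin implements its counters this way.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Turns a running event total into per-interval counts. */
final class IntervalCounter {
    private final AtomicLong total = new AtomicLong();
    private long lastSampled; // total seen at the previous sample

    /** Called on every event, for example each block read. */
    void increment() {
        total.incrementAndGet();
    }

    /** Returns the events since the previous call; intended to run once per interval. */
    synchronized long sample() {
        long now = total.get();
        long delta = now - lastSampled;
        lastSampled = now;
        return delta;
    }
}
```

A sampler that calls sample() once a minute would then produce one data point per minute, and an idle minute would produce zero.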

Troubleshooting

A high block read or write count may indicate that the block cache is too small. Since the purpose of the block cache is to reduce slow filesystem reads and writes, high block read and write counts mean Resin is spending more time reading and writing files.

Resin|Cluster

The Cluster group measures outgoing connections to other servers in the Resin system. This measurement is similar to the heartbeat since it counts cluster connections.

There are separate meters for each target server, so server #2 will have meters for its connections to servers #0 and #1.

Connection Active|NN cluster-id
Measure the current active connections from this server to a target server named by "NN:cluster-id"
Connection Count|NN cluster-id
Counts the number of connections created in the last 60s from this server to the target server.
Idle Active|NN cluster-id
Counts the current idle connections in the pool from this server to the target server.
Idle Count|NN cluster-id
Counts the number of transitions to the idle state.
Request Active|NN cluster-id
The current number of active requests from this server to the target server
Request Count|NN cluster-id
The number of requests to the target server in the last 60s.
Request Fail|NN cluster-id
The number of failed requests to the target server in the last 60s
Request Time|NN cluster-id
The average request time for requests to the target server in the last 60s.
Request Time Max|NN cluster-id
The longest request time for a request to the target server in the last 60s
Request Time 95%|NN cluster-id
The time for 95% of requests to complete
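
To make the difference between the average, maximum, and 95% request time meters concrete, the sketch below computes all three from a set of request times. It uses the nearest-rank percentile, which is one common definition; the source does not state which method Resin uses, and RequestTimeStats is a hypothetical class name.

```java
import java.util.Arrays;

/** Summary statistics over the request times seen in one sample interval. */
final class RequestTimeStats {
    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static double percentile(double[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Multiply before dividing so that whole-number ranks are computed exactly.
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double average(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return samples.length == 0 ? 0 : sum / samples.length;
    }

    static double max(double[] samples) {
        double max = 0;
        for (double s : samples) {
            max = Math.max(max, s);
        }
        return max;
    }
}
```

A single very slow request raises the maximum sharply and the average somewhat, while the 95% value moves only when a meaningful share of requests is slow, which is why the three meters are worth reading together.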

Resin|Database

The data for the Resin database pool lets you tune the pool and check for slow database queries.

Query Active
Counts the current number of active queries.
Query Count
Counts the queries in the last 60s
Query Time
The average query time in the last 60s
Query Time Max
The maximum query time in the last 60s
Query Time 95%
The time for 95% of queries to complete
Connection Active
The current number of active database connections
Connection Count
The number of connections created in the last 60s
Connection Time
The average open time for connections in the last 60s.
Idle Active
The current number of idle connections in the database pool
Idle Count
The number of connections changing to the idle state in the last 60s
Idle Time
The average time connections are idle for the last 60s.

Resin|Health

Each health check in Resin's health system records its current status as a meter. Since the "OK" level is zero, a stable system has a zero graph. The warning level is 1 and the fail level is 2.
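
The level encoding described above can be sketched as a small enum. HealthLevel is a hypothetical name, and the sketch covers only the three levels mentioned in the text; any other meter value is rejected rather than guessed at.

```java
/** Maps health meter values to the levels named in the text: 0 = OK, 1 = warning, 2 = fail. */
enum HealthLevel {
    OK(0), WARNING(1), FAIL(2);

    final int meterValue;

    HealthLevel(int meterValue) {
        this.meterValue = meterValue;
    }

    static HealthLevel fromMeterValue(int value) {
        for (HealthLevel level : values()) {
            if (level.meterValue == value) {
                return level;
            }
        }
        throw new IllegalArgumentException("unmapped health meter value: " + value);
    }

    /** Worst level seen in a series of samples; an all-zero (or empty) series yields OK. */
    static HealthLevel worst(int[] samples) {
        HealthLevel worst = OK;
        for (int sample : samples) {
            HealthLevel level = fromMeterValue(sample);
            if (level.meterValue > worst.meterValue) {
                worst = level;
            }
        }
        return worst;
    }
}
```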

Resin|Http

HTTP requests and sessions are recorded in the Resin|Http section, letting you check for slow requests and unexpected HTTP session sizes.

Request Active
The current number of active requests
Request Bytes
The number of bytes transferred in the last 60s
Request Count
The number of requests in the last 60s
Request Time
The average time for a request in the last 60s
Request Time Max
The slowest request in the last 60s
Request Time 95%
The time for 95% of requests to complete
Session Save Count
The number of sessions saved in the last 60s
Session Save Size
The average serialized session size in the last 60s

Resin|Thread

Statistics related to the Resin thread pool, used for requests and timers.

Thread Active Count
The number of threads currently active
Thread Count
The total number of threads managed by Resin
Thread Create Count
The threads created by Resin in the last 60s
Thread Idle Count
The current number of threads idle in the pool.
Thread Overflow Count
The number of threads Resin created using the overflow method.
Thread Priority Queue
The number of threads dispatching the priority queue
Thread Starting Count
The current number of threads starting
Thread Task Queue
The number of threads reading from the task queue
Thread Wait Count
The requests waiting for an active thread

Troubleshooting

  • The thread create count should generally be very low, and preferably zero. A high create count means the pool is not reusing threads effectively.
  • The overflow count should be zero unless the pool is overflowing.
  • The Priority Queue and Task Queue counts should generally be zero unless there's a thread spike.
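
The rules of thumb above can be written as a simple check over one sample of the thread meters. This is an illustrative sketch: ThreadPoolSample is a hypothetical class, and the maxCreatePerMinute threshold is an assumption, since the text only says the create count should be "very low".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical one-minute sample of four Resin|Thread meters. */
final class ThreadPoolSample {
    final long createCount;
    final long overflowCount;
    final long priorityQueue;
    final long taskQueue;

    ThreadPoolSample(long createCount, long overflowCount, long priorityQueue, long taskQueue) {
        this.createCount = createCount;
        this.overflowCount = overflowCount;
        this.priorityQueue = priorityQueue;
        this.taskQueue = taskQueue;
    }

    /** Applies the troubleshooting rules; maxCreatePerMinute is an assumed threshold. */
    List<String> warnings(long maxCreatePerMinute) {
        List<String> out = new ArrayList<>();
        if (createCount > maxCreatePerMinute) {
            out.add("Thread Create Count is high: the pool may not be effective");
        }
        if (overflowCount > 0) {
            out.add("Thread Overflow Count is non-zero: the pool is overflowing");
        }
        if (priorityQueue > 0 || taskQueue > 0) {
            out.add("Queue counts are non-zero: possible thread spike");
        }
        return out;
    }
}
```

For example, new ThreadPoolSample(25, 1, 0, 3).warnings(5) returns three warnings, while an all-zero sample returns none.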