In-Depth Understanding of the JVM: Performance Monitoring Tools (Part 7)




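Use the top command to view the system's CPU, memory, and process status. top shows in detail the current CPU, memory, and load of the system, along with the state of each process (PID, CPU usage, memory usage, user). From the output above, MySQL and Java are installed on this system, and their process IDs are visible. If the Java process is consuming a lot of CPU and memory, pay attention: if the program contains no especially compute-heavy or memory-hungry code, the Java program may have run into an unknown problem, and you can trace it further using the process ID. Besides finding process information with top, you can also use jps, a tool bundled with the JDK, to find the process ID of a Java program directly.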
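The figures that top reports from outside the process can also be read from inside a running JVM. The following is a minimal sketch, assuming Java 9 or later (for `ProcessHandle`); the class name `JvmSnapshot` and the output format are illustrative and not part of the original article.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

public class JvmSnapshot {
    /** Returns a one-line summary: this JVM's PID, the system load, and heap usage. */
    public static String snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long pid = ProcessHandle.current().pid();   // the same PID that top and jps display
        double load = os.getSystemLoadAverage();     // negative when the platform cannot report it
        long mb = 1024L * 1024L;
        return String.format("pid=%d cpus=%d load1m=%.2f heapUsedMB=%d heapCommittedMB=%d",
                pid, os.getAvailableProcessors(), load,
                heap.getUsed() / mb, heap.getCommitted() / mb);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging such a line periodically makes it easier to match a JVM against the PID column in top when several Java processes run on the same host.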


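As you can see, the jps command lists the Java processes currently on the system. The first one here is the Java process of the jps command itself, and the other is a NoSQL monitoring tool process that I started. Once you have the process ID of the Java program this way, you can go further with: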


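The command above samples once per second, four times in total. High CPU usage together with frequent context switches means that threads are being switched constantly, which may indicate that your program has started a large number of threads that are competing for resources. Swap is another metric worth watching: a high swpd value may mean the system is short of usable physical memory and has had to fall back on swap space. One exception is that some programs prefer swap, so swap usage soars while plenty of physical memory remains free. Both situations deserve attention.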
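When frequent context switches point to too many threads, counting the JVM's live threads by state is a quick in-process check. This is a small sketch rather than a replacement for the system-level tools; the class name `ThreadStateCount` is hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateCount {
    /** Counts the live threads of this JVM, grouped by Thread.State. */
    public static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Many BLOCKED threads suggest lock contention; a very large total
        // suggests the program is creating more threads than it needs.
        countByState().forEach((state, n) -> System.out.println(state + " " + n));
    }
}
```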

yum install sysstat

pidstat has other options as well; you can list them with pidstat --help, so they are not repeated here.



First, a certain amount of knowledge of the underlying principles is needed as a foundation; second, you need to master the process and methods for troubleshooting and solving problems. This article introduces performance monitoring tools that help developers find the root cause of problems faster and more accurately. It is divided into three parts: the first introduces common monitoring tools in the Linux environment, the second introduces monitoring tools in the Windows environment, and the third uses a case study to show, step by step, how to use these tools to find the problem in a Java application.

Linux environment monitoring tools

One thing needs to be stated first: some of the tools described below can be used in both Linux and Windows environments, but certain tools are a better fit for one environment than the other.

Let's imagine the following scenario: one day the operations staff notice that the server load in the production environment has risen, the CPU has soared, and memory usage has increased. What should they do next? Some might say: find out why the load is rising and the CPU is soaring, and if the application is the cause, kill it. That may make the problem go away for a while, but the application is unavailable in the meantime and the root cause has not been found, so the problem will come back after the next restart. So what does a rigorous troubleshooting process look like when a problem of this kind arises? Before answering, let's take a look at several monitoring tools commonly used on Linux:

Double-click the Java program you want to monitor in the left-hand pane to start monitoring it. The tool covers CPU, memory, threads, and classes, and it is very powerful: all of the functions described above are available in it. Of course, learning how to use it and how to analyze its output takes time and accumulated experience.

MemoryAnalyzer.exe: As mentioned above, it is commonly used to analyze heap memory usage and is a very powerful tool. Its detailed usage is not covered here; you can download it and try it out.

The sections above covered monitoring tools for the Linux and Windows environments. With these tools in hand, we can put them to work. Here is a simple case that shows how to use them.


First, after the introduction above, we should have a rough picture of the troubleshooting process. Let's lay it out here:


After a Java application was started, users found that it was unavailable. We analyze this symptom as follows:

1. First, check the application's status on the server. Use the jps command to list the currently running Java processes:

The Java application with process ID 6400 is the one that was just started, which shows that the application has not crashed and is still running.

2. Check the CPU, memory, and current load attributable to that process ID: top -p 6400.

From the result above, the application has not pushed the system load too high, and CPU and memory show nothing abnormal.

3. Based on the results above, we infer that a memory-related fault is unlikely, so we look at the thread stacks first and use the jstack command to export them.

jstack 6400 > stack.out

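We copy the file over so that it is easier to view. Looking at the thread stacks, the main thread is running, while the child threads ThreadA, ThreadB, ThreadC, and ThreadD are each waiting for one lock while holding another. At this point we can essentially conclude that the application is deadlocked, which leaves the threads waiting and makes the application unavailable. With this stack information we can go into the program code, examine it in detail, and fix the bug to resolve the problem.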
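The same conclusion can be reached from inside the JVM with the standard `java.lang.management` API, which reports threads that are deadlocked on monitors or ownable synchronizers. The sketch below is one possible in-process check, not the method used in the case above; the class name `DeadlockCheck` and the message format are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class DeadlockCheck {
    /** Returns one description per deadlocked thread, or an empty list if there is none. */
    public static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> found = describeDeadlocks();
        System.out.println(found.isEmpty() ? "no deadlock detected" : String.join("\n", found));
    }
}
```

A check like this could be run from a scheduled health task, so that a deadlock shows up in the logs before users report the application as unavailable.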

A note on how deadlock works:


As shown in the figure, a deadlock arises because the threads constrain one another: each waits for a lock that another thread holds, so no thread can continue to execute.
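The mutual constraint can be reproduced with two threads that take the same two locks in opposite order. This is a minimal sketch for illustration only; the thread names mirror the case above, but the code is not the application from the case, and `DeadlockDemo`, `lockA`, and `lockB` are hypothetical names. The threads are daemon threads so that the JVM can still exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DeadlockDemo {
    private static Thread worker(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other thread holds its first lock too
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) { // never entered: the other thread holds this lock
                    System.out.println(name + " acquired both locks");
                }
            }
        }, name);
        t.setDaemon(true); // allow the JVM to exit even though the threads stay stuck
        return t;
    }

    /** Starts two threads that lock in opposite order; returns true once both are BLOCKED. */
    public static boolean reproduce(long timeoutMillis) {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch latch = new CountDownLatch(2);
        Thread t1 = worker("ThreadA", lockA, lockB, latch);
        Thread t2 = worker("ThreadB", lockB, lockA, latch);
        t1.start();
        t2.start();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (t1.getState() == Thread.State.BLOCKED && t2.getState() == Thread.State.BLOCKED) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + reproduce(2000));
    }
}
```

Having every thread acquire the locks in the same order (for example, always lockA before lockB) removes the cycle, which is the usual kind of fix once a thread dump has identified the locks involved.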


This article introduced the commonly used monitoring tools in Linux and Windows environments, and closed with a case that walks through the troubleshooting process and shows how to use the monitoring tools to find the cause of an application failure.