Babar is a profiler for Java applications developed to profile large-scale distributed applications such as Spark, Scalding, MapReduce, or Hive programs.
Babar registers metrics about memory, CPU, and garbage collection usage, as well as method calls, in each individual JVM, then aggregates them over the entire application to produce ready-to-use graphs of the program's resource usage and method calls (as flame graphs), as shown in the screenshots section below.
Babar is currently designed to profile YARN applications, but could be extended to profile other types of applications.
Babar is composed of two main components:
- babar-agent
- babar-processor
The babar-agent is a java-agent program. An agent is a jar that can be attached to a JVM in order to instrument it. At regular intervals, the agent fetches information on resource consumption and logs the resulting metrics in a plain-text file named babar.log
inside the YARN log directory. At the end of the application, YARN's log aggregation then combines all the executor logs into a single log file on HDFS.
The babar-processor is the piece of software responsible for parsing the aggregated log file of the YARN application and aggregating the metrics found in it to produce the graphs. The logs are parsed as streams, which allows the babar-processor to aggregate large log files (dozens of GB) without needing to load them entirely into memory at once.
Once the babar-processor has run, a new directory is created containing two HTML files with the graphs (memory, CPU usage, GC usage, executor counts, flame graphs, ...).
The babar-agent instruments the JVM to register and log the resource usage metrics. It is a standard java-agent
component (see the instrumentation API doc for more information).
In order to add the agent to a JVM, add the following argument to the java command line used to start your application:
-javaagent:/path/to/babar-agent.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=1024],CPUTimeProfiler[profilingMs=5000]
You will need to replace /path/to/babar-agent.jar
with the actual path of the agent jar on your system. This jar must be locally accessible to your JVM (i.e. distributed on all your YARN nodes).
The profilers can be added and configured using this command line. The profilers and their configuration are described below.
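For instance, assuming the agent jar has been placed at /path/to/babar-agent.jar and the application's entry point is a hypothetical my-application.jar, a complete launch command could look like the following sketch:

```shell
# Sketch: start an application with the babar-agent attached
# (both jar paths are placeholders for your actual files)
java \
  -javaagent:/path/to/babar-agent.jar=CPUTimeProfiler[profilingMs=5000],MemoryProfiler[profilingMs=5000,reservedMB=1024] \
  -jar my-application.jar
```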
Three profilers are available:
- `CPUTimeProfiler`: this profiler registers and logs CPU usage and GC activity metrics at a regular interval. This interval can be configured using the `profilingMs` option in its arguments (e.g. `CPUTimeProfiler[profilingMs=5000]` will add the profiler and make it register metrics every 5 seconds)
- `MemoryProfiler`: this profiler registers metrics about memory (heap and off-heap used and committed memory) as well as the memory reserved for the containers. The frequency of the profiling can be adjusted with `profilingMs`, and the amount of memory reserved for the executor can be indicated with `reservedMB`
- `StackTraceProfiler`: this profiler registers the stack traces of all `RUNNABLE` threads at one interval (the `profilingMs` option) and logs them at another interval (the `reportingMs` option) in order to aggregate multiple traces before logging them and save space in the logs. The traces are always logged at JVM shutdown, so the reporting interval can be set very high to save the most space in the logs if you are not interested in having traces logged in case the JVM is killed or fails.
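The profilers are listed in the agent argument, so you can attach only the ones you need. As a sketch, a configuration enabling only the `StackTraceProfiler` (with a placeholder jar path) would be:

```shell
-javaagent:/path/to/babar-agent.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000]
```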
The babar-processor is the piece of software that parses the logs and aggregates the metrics into graphs.
The processor needs to parse the application log aggregated by YARN, either from HDFS or from a local log file that has been fetched using the following command (replace the application id with yours):
yarn logs -applicationId application_1514203639546_124445
To run the babar-processor, the following command can be used:
java -jar /path/to/babar-processor.jar -l myAppLog.log
The processor accepts the following arguments:
-l, --log-file <arg> the log file to open (REQUIRED)
-c, --containers <arg> if set, only metrics of containers matching
these prefixes are aggregated
(comma-separated)
-o, --output-dir <arg> path of the output dir (default: ./output)
-t, --time-precision <arg> time precision (in ms) to use in aggregations
(default: 10000)
-m, --traces-min-ratio <arg> min ratio of trace samples
to show trace in graph
(default: 0.001)
-p, --traces-prefixes <arg> if set, traces will be aggregated only from
methods matching the prefixes
(comma-separated, eg: org.mygroup)
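Combining several of these options, a hypothetical invocation that writes the graphs to a custom directory, coarsens the time precision, and only keeps traces from the application's own packages could look like:

```shell
# Sketch: run the processor on a locally fetched log file
# (jar path, log file, and package prefix are placeholders)
java -jar /path/to/babar-processor.jar \
  -l myAppLog.log \
  -o ./babar-output \
  -t 60000 \
  -p org.mygroup
```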
In the output dir (by default ./output
), two HTML files containing the graphs will be generated: memory-cpu.html
and traces.html
.
No code changes are required to instrument a Spark job, since Spark allows you to distribute the agent jar archive to all containers using the --files
command argument.
In order to instrument your Spark application, simply add these arguments to your spark-submit
command:
--files ./babar-agent-1.0-SNAPSHOT.jar
--conf spark.executor.extraJavaOptions="-javaagent:./babar-agent-1.0-SNAPSHOT.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=7175],CPUTimeProfiler[profilingMs=5000]"
You can adjust the reserved memory setting according to spark.executor.memory + spark.yarn.executor.memoryOverhead
.
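As a sketch of that arithmetic, assuming a hypothetical executor configured with spark.executor.memory=6g and spark.yarn.executor.memoryOverhead=1024 (MB), the reservedMB value could be computed as:

```shell
# Hypothetical example values: 6 GiB executor memory plus 1024 MB overhead
EXECUTOR_MEMORY_MB=$((6 * 1024))
MEMORY_OVERHEAD_MB=1024
RESERVED_MB=$((EXECUTOR_MEMORY_MB + MEMORY_OVERHEAD_MB))
echo "$RESERVED_MB"   # 7168
```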
You can then use the `yarn logs`
command to get the aggregated log file and process the logs using the babar-processor.
If the jar is already distributed on your nodes at /path/to/babar-agent-1.0-SNAPSHOT.jar
, then you only need to add some command line arguments to your Scalding application command as below:
-Dmapreduce.map.java.opts="-javaagent:/path/to/babar-agent-1.0-SNAPSHOT.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=2500],CPUTimeProfiler[profilingMs=5000]"
-Dmapreduce.reduce.java.opts="-javaagent:/path/to/babar-agent-1.0-SNAPSHOT.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=3500],CPUTimeProfiler[profilingMs=5000]"
You can adjust the reserved memory value for mappers and reducers independently. This value can also be determined programmatically. You will find an example of how to instrument a job to determine these values and set the configuration programmatically in the babar-scalding
module.
You will also find an example of how to distribute an agent jar to all the containers when starting the application in the babar-scalding
module.
Similarly to Spark, Hive makes it easy to distribute the jar to the executors. To profile a Hive application, simply execute the following commands:
ADD FILE /home/b.hanotte/babar-agent-1.0-SNAPSHOT.jar;
SET mapreduce.map.java.opts="-javaagent:./babar-agent-1.0-SNAPSHOT.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=2560],CPUTimeProfiler[profilingMs=5000]";
SET mapreduce.reduce.java.opts="-javaagent:./babar-agent-1.0-SNAPSHOT.jar=StackTraceProfiler[profilingMs=100,reportingMs=60000],MemoryProfiler[profilingMs=5000,reservedMB=3684],CPUTimeProfiler[profilingMs=5000]";