Notable Events (High Memory Utilization)

Hi Team,

We are getting the alert "High Memory Utilization in Core Component" and could not find any guide for troubleshooting it. Please help us fix this issue.

Regards.

Hi Sourav,

Could you please check which processes are consuming the most memory on the core server using the commands below?

$ htop
$ ps -o pid,user,%mem,command ax | sort -b -k3 -r
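If scanning the full list is tedious, a one-liner like this (just a sketch) groups %mem by command name so the heaviest services stand out:

```shell
# Group %mem by command name and show the ten heaviest services
ps -e -o comm,%mem --no-headers \
  | awk '{mem[$1] += $2} END {for (c in mem) printf "%6.1f %s\n", mem[c], c}' \
  | sort -rn | head -n 10
```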

Also, please share screenshots of the following pages:

  1. Manage Component page
  2. Core Component page
  3. Datanode Component page

Regards,
Ben

Hi Ben,

Thanks for the response.

Please find the output of the commands below:

socadmin@dnif-core-vm:~$ ps -o pid,user,%mem,command ax | sort -b -k3 -r
PID USER %MEM COMMAND
17950 root 7.5 /usr/lib/jvm/java-14-openjdk-amd64/bin/java -Dproc_namenode -Djava.net.preferIPv4Stack=true -Dhdfs.audit.logger=WARN,NullAppender -Dhadoop.security.logger=WARN,RFAS -Dyarn.log.dir=/opt/hadoop-3.2.1/logs -Dyarn.log.file=hadoop-root-namenode-dnif-core-vm.vpsoc.com.log -Dyarn.home.dir=/opt/hadoop-3.2.1 -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/hadoop-3.2.1/lib/native -Dhadoop.log.dir=/opt/hadoop-3.2.1/logs -Dhadoop.log.file=hadoop-root-namenode-dnif-core-vm.vpsoc.com.log -Dhadoop.home.dir=/opt/hadoop-3.2.1 -Dhadoop.id.str=root -Dhadoop.root.logger=WARN,RFA -Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.hdfs.server.namenode.NameNode
1130095 root 7.1 /usr/lib/jvm/java-14-openjdk-amd64/bin/java -cp /dnif/correlation_server/conf/:/opt/spark-3.1.1-bin-hadoop3.2/jars/* -Xmx3g -Dderby.system.home=/dnif/correlation_server/derby org.apache.spark.deploy.SparkSubmit --master spark://172.16.40.236:7077 --conf spark.executor.memory=4g --conf spark.sql.hive.filesourcePartitionFileCacheSize=2000000000 --conf spark.driver.memory=3g --conf spark.sql.files.ignoreMissingFiles=true --conf spark.sql.parquet.mergeSchema=true --conf spark.driver.maxResultSize=1g --conf spark.sql.files.ignoreCorruptFiles=true --conf spark.executor.cores=2 --conf spark.sql.files.maxPartitionBytes=512000000 --conf spark.sql.shuffle.partitions=1 --conf spark.cores.max=4 --conf spark.driver.extraJavaOptions=-Dderby.system.home=/dnif/correlation_server/derby --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --name correlation_server spark-internal --hiveconf hive.server2.thrift.port=10001
1130386 root 3.4 /usr/lib/jvm/java-14-openjdk-amd64/bin/java -cp /dnif/report_server/conf/:/opt/spark-3.1.1-bin-hadoop3.2/jars/* -Xmx3g -Dderby.system.home=/dnif/report_server/derby org.apache.spark.deploy.SparkSubmit --master spark://172.16.40.236:7077 --conf spark.executor.memory=4g --conf spark.driver.memory=3g --conf spark.sql.parquet.mergeSchema=true --conf spark.driver.maxResultSize=1g --conf spark.sql.files.ignoreCorruptFiles=true --conf spark.executor.cores=2 --conf spark.sql.files.maxPartitionBytes=512000000 --conf spark.sql.shuffle.partitions=100 --conf spark.sql.hive.filesourcePartitionFileCacheSize=2000000000 --conf spark.sql.files.ignoreMissingFiles=true --conf spark.sql.thriftServer.incrementalCollect=true --conf spark.cores.max=2 --conf spark.driver.extraJavaOptions=-Dderby.system.home=/dnif/report_server/derby --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --name report_server spark-internal --hiveconf hive.server2.thrift.port=10002
1130922 root 15.4 /usr/lib/jvm/java-14-openjdk-amd64/bin/java -cp /dnif/query_server/conf/:/opt/spark-3.1.1-bin-hadoop3.2/jars/* -Xmx3g -Dderby.system.home=/dnif/query_server/derby org.apache.spark.deploy.SparkSubmit --master spark://172.16.40.236:7077 --conf spark.executor.memory=3g --conf spark.sql.hive.filesourcePartitionFileCacheSize=2000000000 --conf spark.driver.memory=3g --conf spark.sql.files.ignoreMissingFiles=true --conf spark.sql.parquet.mergeSchema=true --conf spark.driver.maxResultSize=1g --conf spark.sql.files.ignoreCorruptFiles=true --conf spark.executor.cores=2 --conf spark.sql.files.maxPartitionBytes=512000000 --conf spark.sql.shuffle.partitions=1200 --conf spark.cores.max=12 --conf spark.driver.extraJavaOptions=-Dderby.system.home=/dnif/query_server/derby --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --name Thrift JDBC/ODBC Server --name query_server spark-internal --hiveconf hive.server2.thrift.port=10000
16366 root 10.8 /usr/bin/python3 /usr/local/bin/gunicorn --certfile=/dnif/ssl/dnif_https_API.crt --keyfile=/dnif/ssl/dnif_https_API.key --bind 0.0.0.0:8090 --worker-class gevent -w 3 --pythonpath /usr/src/nm/core-v9/api_service/ dispatcher:app --log-level DEBUG
16342 root 10.8 /usr/bin/python3 /usr/local/bin/gunicorn --certfile=/dnif/ssl/dnif_https_API.crt --keyfile=/dnif/ssl/dnif_https_API.key --bind 0.0.0.0:8090 --worker-class gevent -w 3 --pythonpath /usr/src/nm/core-v9/api_service/ dispatcher:app --log-level DEBUG
16344 root 10.7 /usr/bin/python3 /usr/local/bin/gunicorn --certfile=/dnif/ssl/dnif_https_API.crt --keyfile=/dnif/ssl/dnif_https_API.key --bind 0.0.0.0:8090 --worker-class gevent -w 3 --pythonpath /usr/src/nm/core-v9/api_service/ dispatcher:app --log-level DEBUG
17983 root 0.8 /usr/lib/jvm/java-14-openjdk-amd64/bin/java -cp /opt/spark-3.1.1-bin-hadoop3.2/conf/:/opt/spark-3.1.1-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 172.16.40.236 --port 7077 --webui-port 8080
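From the output above, the three gunicorn API workers alone hold roughly a third of memory; a quick check of that sum, using the %mem values pasted above:

```shell
# Sum the %mem of the three gunicorn workers from the listing above
printf '10.8\n10.8\n10.7\n' \
  | awk '{total += $1} END {printf "gunicorn workers: %.1f%%\n", total}'
# → gunicorn workers: 32.3%
```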

Screenshots:

  1. Manage Component page


  2. Core Component page:



  3. Datanode Component page:

Thanks,
Mohammed Abrar

Hi Abrar,

The shared command output appears to be incomplete. Could you please redirect the full command output to a file and share it with us?

Command:
$ ps -o pid,user,%mem,command ax | sort -b -k3 -r > /var/tmp/corememory.txt
(Share the corememory.txt from /var/tmp)

Please also share a screenshot of the $ htop output.
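Since htop is interactive and hard to attach, you can also capture one-shot snapshots to files (a sketch; the /var/tmp file names are just suggestions):

```shell
# Non-interactive snapshots that can be attached as files
top -b -n 1 > /var/tmp/top_snapshot.txt   # full process table, batch mode
free -h > /var/tmp/free_snapshot.txt      # overall memory/swap summary
```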

Hi Ben,

Screenshot of htop command:

Output of $ ps -o pid,user,%mem,command ax | sort -b -k3 -r > /var/tmp/corememory.txt:

I was not able to attach the text file here, so I have shared a screenshot instead.

Thanks