There are many quirks in the Android system, and you will often run into strange memory leaks. I recently came across a relatively rare kind of app leak, and I am writing it up here.
The first weapon I reached for was Android Studio's Profiler.
The application ran for about two hours. The Profiler curve showed no obvious memory growth, yet the following crash appeared:
02-04 13:15:03.661 1070 1087 W libc : pthread_create failed: couldn't allocate 1069056-bytes mapped space: Out of memory
02-04 13:15:03.661 1070 1087 W art : Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Try again"
E mali_so : encounter the first mali_error : 0x0002 : failed to allocate CPU memory (gles_texturep_upload_3d_internal at hardware/arm/maliT760/driver/product/gles/src/texture/mali_gles_texture_upload.c:1030)
02-04 13:15:18.664 1070 1627 E OpenGLRenderer: GL error: Out of memory!
02-04 13:15:18.665 1070 1627 F OpenGLRenderer: GL errors! frameworks/base/libs/hwui/renderthread/CanvasContext.cpp:550
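Incidentally, the size in the first log line is not arbitrary: it is the requested 1040 KB thread stack plus what looks like one 4 KB guard page (an assumption based on a 4 KB page size). A quick check:

```shell
# 1040 KB stack + one 4 KB guard page = the mapping pthread_create asked for
echo $(( 1040 * 1024 + 4096 ))   # prints 1069056, matching the log
```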
The system cannot allocate even ~1 MB? Yet according to the following command, the system still has more than 500 MB of free memory:
cat /proc/meminfo
MemTotal: 2045160 kB
MemFree: 529064 kB
MemAvailable: 1250020 kB
Buffers: 1300 kB
Cached: 891916 kB
SwapCached: 0 kB
Active: 556296 kB
Inactive: 674200 kB
Active(anon): 235800 kB
Inactive(anon): 224668 kB
Active(file): 320496 kB
Inactive(file): 449532 kB
Unevictable: 256 kB
Mlocked: 256 kB
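Free physical memory, however, says nothing about the process's own virtual address space, which is what pthread_create actually failed to map. A quick way to eyeball the suspect process (a sketch; it defaults to the current shell's PID only so it can be tried anywhere — on the device, pass 1070):

```shell
# Compare the virtual address space (VmPeak/VmSize) against resident
# memory (VmRSS) for a process; a huge gap hints at address-space pressure.
PID=${1:-$$}
grep -E 'VmPeak|VmSize|VmRSS' "/proc/$PID/status"
```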
It's weird. So I wrote two scripts: one automatically taps through the app to drive the test, and the other records system memory, application memory, and the process's open handles during the test.
The tap-test script:
#!/system/bin/sh
COUNT=1
FILE=/sdcard/top.txt
#input tap 800 500
#input text 1599828
#input tap 1600 500
echo "Auto test start:"
PROCESS=$1
if [ "$1" == "" ];then
echo "./auto_test.sh should be run with a process name"
exit
fi
PID=`pidof $PROCESS`
echo "start test $PROCESS,pid is $PID:"
echo "======================================================="
while(true)
do
echo $COUNT : `date`
# input tap 1800 900
# input tap 800 500
#input text 1599828
#input tap 1600 500
#input keyevent 66
#switch to face login
#input tap 800 330
procrank | grep -E "$PROCESS|RAM"
cat /proc/meminfo | grep -A 2 MemTotal:
echo "------------------------------------------------------"
sleep 2
input tap 1000 700
sleep 6
input tap 1900 80
sleep 2
#confirm button ok
input tap 1150 750
if [ ! -d "/proc/$PID" ];then
TIME=$(date "+%Y%m%d_%H%M%S")
BUG_REPORT=/sdcard/bugreport_${TIME}_${PID}.txt
echo "$PROCESS died at:" `date`
echo "save bugreport to:$BUG_REPORT"
bugreport > $BUG_REPORT
exit
fi
COUNT=$(expr $COUNT + 1 )
done;
Part of the recording script is as follows:
#!/system/bin/sh
INTERVAL=60
ENABLE_LOG=true
PID=$1
TIME=$(date "+%Y%m%d-%H%M")
WORK_DIR=/sdcard/rk/$TIME
MEMINFO_DIR=$WORK_DIR/meminfo
LSOF_DIR=$WORK_DIR/lsof
LOG_DIR=$WORK_DIR/log
SYSTEM_MEM_DIR=$WORK_DIR/system_mem
STATUS_DIR=$WORK_DIR/status
THREAD_DIR=$WORK_DIR/thread
PROCRANK_DIR=$WORK_DIR/procrank
FD_DIR=$WORK_DIR/fd
PS=$WORK_DIR/ps.txt
LSOF=$WORK_DIR/lsof.txt
INFO=$WORK_DIR/info.txt
COUNT=1
mkdir -p $WORK_DIR
mkdir -p $MEMINFO_DIR
mkdir -p $LSOF_DIR
mkdir -p $LOG_DIR
mkdir -p $SYSTEM_MEM_DIR
mkdir -p $STATUS_DIR
mkdir -p $THREAD_DIR
mkdir -p $PROCRANK_DIR
mkdir -p $FD_DIR
#echo `date >> $LOG`
#echo `date >> $SLAB`
PROCESS_NAME=$(cat /proc/$1/cmdline 2>/dev/null)
#set -x
if [ $1 ]; then
echo "================================================"
echo "Track process: $PROCESS_NAME,pid: $1 "
echo "Start at : `date`"
PID_EXIST=` ps | grep -w $PID | wc -l`
if [ $PID_EXIST -lt 1 ];then
echo "Pid: $1 does not exist!"
exit 1
fi
if [ ! -r /proc/$PID/fd ];then
echo "You should run in root user."
exit 2
fi
else
echo "You should run with track process pid!($0 pid)"
exit 4
fi
echo "Update logcat buffer size to 2M."
logcat -G 2M
echo "Save record in: $WORK_DIR"
echo Record start at:`date` >> $INFO
echo "$PROCESS_NAME,pid is:$1" >> $INFO
echo -------------------------------------------------->> $INFO
echo "Current system info:" >> $INFO
echo /proc/sys/kernel/threads-max: >> $INFO
cat /proc/sys/kernel/threads-max >> $INFO
echo /proc/$1/limits: >> $INFO
cat /proc/$1/limits >> $INFO
while((1));do
NOW=`date`
if [ ! -d "/proc/$PID" ];then
echo "$PROCESS_NAME has died, stopping proc info recording!"
echo -------------------------------------------------->> $INFO
echo "Record total $COUNT times." >> $INFO
logcat -d >> $LOG_DIR/last_log.txt
cp -rf /data/tombstones $WORK_DIR/
TIME=$(date "+%Y%m%d_%H%M%S")
BUG_REPORT=$WORK_DIR/bugreport_${TIME}_${PID}.txt
echo "save bugreport to:$BUG_REPORT"
bugreport > $BUG_REPORT
exit
fi
NUM=`ls -l /proc/$PID/fd | wc -l`
TIME_LABEL="\n$NOW:--------------------------------------------------:$COUNT"
echo -e $TIME_LABEL >> $PS
ps >> $PS
echo -e $TIME_LABEL >> $MEMINFO_DIR/${COUNT}_meminfo.txt
dumpsys meminfo $1 >> $MEMINFO_DIR/${COUNT}_meminfo.txt
echo -e $TIME_LABEL >> $SYSTEM_MEM_DIR/${COUNT}_sys_meminfo.txt
cat /proc/meminfo >> $SYSTEM_MEM_DIR/${COUNT}_sys_meminfo.txt
echo -e $TIME_LABEL >> $PROCRANK_DIR/${COUNT}_procrank.txt
procrank >> $PROCRANK_DIR/${COUNT}_procrank.txt
COUNT=$(expr $COUNT + 1 )
sleep $INTERVAL
done
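Once the recorder has run for a while, the per-snapshot procrank files can be boiled down to a single Vss-over-time series. A post-processing sketch (the directory layout matches the script above; the awk field index assumes standard procrank output, where Vss is the second column):

```shell
# vss_series: print "<snapshot>: <Vss>" for one process across all
# recorded procrank snapshots, in recording order.
# Usage: vss_series <process-name> <procrank-dir>
vss_series() {
  p=$1
  dir=$2
  for f in $(ls "$dir" | sort -n); do
    awk -v p="$p" -v f="$f" '$0 ~ p { print f ": " $2 }' "$dir/$f"
  done
}
# e.g.: vss_series com.cnsg.card /sdcard/rk/20190204-1300/procrank
```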
It took about two hours for the problem to reproduce. Comparing the information the scripts recorded, no leak was obvious at first: cat /proc/meminfo showed that available system memory had not dropped significantly. But comparing the ps results from different time periods turned up something:
u0_a9 1070 217 2057984 379264 SyS_epoll_ b5fd37a4 S com.cnsg.card
u0_a9 1070 217 2663308 581404 binder_thr b5fd38e8 S com.cnsg.card
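In this ps format the fourth column is VSIZE and the fifth is RSS, both in KB. The growth between the two samples is lopsided:

```shell
# Virtual size grew roughly 3x more than resident memory between samples
echo "VSIZE growth: $(( 2663308 - 2057984 )) KB"   # 605324 KB, ~591 MB
echo "RSS growth:   $(( 581404 - 379264 )) KB"     # 202140 KB, ~197 MB
```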
Comparing the procrank output across the same time periods confirmed it: the VSS of this process keeps growing. Searching for background information turned up the following explanation:
VmRSS alone cannot prove a memory leak; VmSize can.
A typical memory leak grows VmSize and VmRSS at the same time, so a leak can often be spotted just by watching VmRSS (unless the allocations bypass malloc-style heap allocation). But VmRSS growth by itself does not mean there is a leak.
Monitoring VmSize is actually the sound approach, for the following reason:
VmSize is the total virtual memory (file mappings, shared memory, heap, and everything else that backs VmRSS). Its changes should not be "too fast"; in the cited example, a process called ch___mgr settled at 58824 K long ago, because with matched malloc/free calls, VmSize does not keep going up.
VmRSS is the physical memory actually being used, and growth there can be a legitimate business need.
An analogy: VmSize is the total assets an official owns (stocks, bank deposits, the house, and so on), while VmRSS is the cash he keeps at home.
If we suspect the official of embezzling, we should be tracking all of his assets, i.e. VmSize.
It would obviously be absurd to call moving money from the bank into the house corruption.
This confirms that a virtual memory leak has occurred. The next step is to find out what in the process is leaking virtual memory. We did not have the application's source code, so we could only treat it as a black box and keep digging for clues under /proc/1070/.
Under /proc/1070/, the process status looked like this:
***/proc/1070 # cat status
Name: com.cnsg.card
State: S (sleeping)
Tgid: 1070
Ngid: 0
Pid: 1070
PPid: 343
TracerPid: 0
Uid: 10062 10062 10062 10062
Gid: 10062 10062 10062 10062
FDSize: 128
Groups: 3003 9997 50062
VmPeak: 3619748 kB
VmSize: 3531764 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 878692 kB
VmRSS: 487580 kB
VmData: 1317136 kB
VmStk: 8192 kB
VmExe: 16 kB
VmLib: 177524 kB
VmPTE: 3488 kB
VmPMD: 32 kB
VmSwap: 0 kB
Threads: 48
SigQ: 0/15299
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000001204
SigIgn: 0000000000000000
SigCgt: 20000002000084f8
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000000000000000
CapAmb: 0000000000000000
Seccomp: 0
Cpus_allowed: 30
Cpus_allowed_list: 4-5
Mems_allowed: 1
Mems_allowed_list: 0
voluntary_ctxt_switches: 619990
nonvoluntary_ctxt_switches: 88065
An explanation of the fields above:
The kernel records the process's real memory usage in /proc/{pid}/status.
* VmSize:
Virtual memory size.
The total virtual memory used by the process, which is the sum of VmLib, VmExe, VmData, and VmStk.
* VmLck:
Locked virtual memory.
The amount of virtual memory currently locked by the process.
* VmRSS:
Virtual memory resident set.
The portion held in physical memory; it is not swapped out to disk. It includes code, data, and stack.
* VmData:
Virtual memory for data.
The virtual memory used by the heap.
* VmStk:
Virtual memory for the stack.
The virtual memory used by the stack.
* VmExe:
Executable virtual memory.
The virtual memory used by the executable and statically linked libraries.
* VmLib:
Library virtual memory.
The virtual memory used by dynamically linked libraries.
Then I kept the scripts running while also watching this process directly. After a while I found that, besides the changes in the Vm fields, the Threads count keeps growing; when the test is paused, the thread count does not come back down. It looks like the process keeps creating threads that are never terminated. To see which threads were being created, I used the pstree command:
busybox pstree 1070
com.cnsg.card-+-{Binder:15994_1}
|-{Binder:15994_2}
|-{Binder:15994_3}
|-{Binder:15994_4}
|-{Binder:15994_5}
|-{FinalizerDaemon}
|-{FinalizerWatchd}
|-{HeapTaskDaemon}
|-{JDWP}
|-{Jit thread pool}
|-{Profile Saver}
|-{ReferenceQueueD}
|-{RenderThread}
|-{Signal Catcher}
|-{Thread-11}
|-{Thread-12}
|-{Thread-15}
|-2*[{Thread-2}]
|-{Thread-4}
|-{Thread-5}
|-{Thread-6}
|-{Thread-7}
|-{Thread-8}
|-{Thread-9}
|-{hwuiTask1}
|-{hwuiTask2}
|-{mali-cmar-backe}
|-{mali-hist-dump}
|-{mali-mem-purge}
|-{RxCachedThreadS}(1927)
|-{RxCachedThreadS}(1928)
|-{RxCachedThreadS}(2140)
|-{RxCachedThreadS}(2141)
|-{RxCachedThreadS}(2289)
|-{RxCachedThreadS}(2290)
|-{RxCachedThreadS}(2458)
|-{RxCachedThreadS}(2464)
|-{RxCachedThreadS}(2465)
|-{RxCachedThreadS}(2614)
|-{RxCachedThreadS}(2615)
|-{RxCachedThreadS}(2792)
|-{RxCachedThreadS}(2793)
|-{RxCachedThreadS}(2958)
|-6*[{mali-utility-wo}]
|-10*[{myHandlerThread}]
`-{com.cnsg.card}
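busybox is not always available on a device; the same grouping can be read straight from /proc, since each thread publishes its name in task/&lt;tid&gt;/comm. A sketch (it defaults to the current shell's PID only so it can be tried anywhere; on the device, pass 1070):

```shell
# Group a process's threads by name and count duplicates; one name whose
# count climbs steadily between runs is the leak signature.
PID=${1:-$$}
cat /proc/"$PID"/task/*/comm | sort | uniq -c | sort -rn
```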
Comparing different periods shows that threads named RxCachedThreadS keep increasing. At this point the cause is clear: some code in the process creates threads endlessly. Every thread creation has to map memory, both virtual and physical, and here the process's virtual address space was eventually exhausted, producing the out-of-memory errors above.
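For reference, RxCachedThreadS is the 15-character truncation of RxCachedThreadScheduler, the name RxJava gives the worker threads behind Schedulers.io(); threads piling up under that name usually point at subscriptions that are never disposed. The arithmetic of the exhaustion is easy to check: each thread costs about 1 MB of virtual address space for its stack alone, so a few thousand leaked threads can burn through most of a 32-bit process's ~3 GB of userspace while physical memory stays healthy. A back-of-the-envelope sketch (the leaked-thread count is hypothetical):

```shell
# Address space consumed by thread stacks alone, for a hypothetical
# number of leaked threads (1040 KB stack + 4 KB guard page each)
leaked_threads=2500
per_thread_kb=$(( 1040 + 4 ))
echo "$(( leaked_threads * per_thread_kb )) KB"   # 2610000 KB, ~2.5 GB
```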