巧妙定位ANR问题

96
karlsu
2.1 2016.07.27 16:37* 字数 655

想必做过Android开发的童鞋都遇到过anr(Application Not Responding)。一般的做法就是去/data/anr/trace.txt取对应的 文件。然后,再从这一堆庞大的堆栈信息找到自己想要的结果。但是拿到一堆arn堆栈信息很抓狂,如何从这庞大的堆栈信息找到自己的问题代码呢?这篇文章,给大家分享下如何取定位anr问题。

首先看下堆栈信息

07-26 17:51:04.442: E/ActivityManager(2258): ANR in com.pafinancialtech.zijinbank (com.pafinancialtech.zijinbank/com.pingan.fstandard.paff.activity.MainActivity)
07-26 17:51:04.442: E/ActivityManager(2258): PID: 28560
07-26 17:51:04.442: E/ActivityManager(2258): Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 11.  Wait queue head age: 5573.8ms.)
07-26 17:51:04.442: E/ActivityManager(2258): Load: 5.15 / 3.68 / 3.38
07-26 17:51:04.442: E/ActivityManager(2258): CPU usage from 602ms to -6091ms ago:
07-26 17:51:04.442: E/ActivityManager(2258):   65% 2258/system_server: 14% user + 50% kernel / faults: 9262 minor
07-26 17:51:04.442: E/ActivityManager(2258):   47% 28560/com.pafinancialtech.zijinbank: 10% user + 37% kernel / faults: 4197 minor
07-26 17:51:04.442: E/ActivityManager(2258):   11% 4292/com.android.systemui: 5.9% user + 5.6% kernel / faults: 2762 minor
07-26 17:51:04.442: E/ActivityManager(2258):   9.2% 4819/com.android.phone: 3.1% user + 6.1% kernel / faults: 3972 minor
07-26 17:51:04.442: E/ActivityManager(2258):   1.9% 4774/com.vivo.daemonService: 0.6% user + 1.3% kernel / faults: 3167 minor
07-26 17:51:04.442: E/ActivityManager(2258):   3.5% 4618/com.baidu.map.location: 1.7% user + 1.7% kernel / faults: 3635 minor
07-26 17:51:04.442: E/ActivityManager(2258):   3.5% 28690/com.pafinancialtech.zijinbank:remote: 2.3% user + 1.1% kernel / faults: 2368 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 4580/debuggerd64: 0% user + 0% kernel / faults: 2492 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0.4% 4842/com.vivo.abe: 0.3% user + 0.1% kernel / faults: 1455 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 4705/com.android.server.telecom: 0% user + 0% kernel / faults: 1677 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 4867/com.vivo.fingerprint.fpc.service: 0% user + 0% kernel / faults: 1456 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 4735/com.qualcomm.qti.services.secureui:sui_service: 0% user + 0% kernel / faults: 1503 minor
07-26 17:51:04.442: E/ActivityManager(2258):   1.3% 4165/adbd: 0.1% user + 1.1% kernel / faults: 72 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 28661/debuggerd: 0% user + 0% kernel / faults: 820 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0.5% 24757/logcat: 0.1% user + 0.4% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.5% 28524/kworker/u16:3: 0% user + 0.5% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.5% 28909/kworker/u16:4: 0% user + 0.5% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.4% 181/cfinteractive: 0% user + 0.4% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.4% 26204/logcat: 0.4% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 4215/VosMCThread: 0% user + 0.1% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 4217/VosRXThread: 0% user + 0.1% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 5999/core_ctl/0: 0% user + 0.1% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.2% 28467/irq/13-synaptic: 0% user + 0.2% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 3/ksoftirqd/0: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 6/kworker/u16:0: 0% user + 0.1% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 7/migration/0: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 8/rcu_preempt: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 19/migration/3: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 20/ksoftirqd/3: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 35/kworker/u17:0: 0% user + 0.1% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 207/mmcqd/0: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0.1% 296/surfaceflinger: 0.1% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 611/kworker/3:2: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 4216/VosTXThread: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 5985/migration/5: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   0% 6033/perfd: 0% user + 0% kernel / faults: 4 minor
07-26 17:51:04.442: E/ActivityManager(2258):   0% 27164/kworker/6:0: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258): 0.-4% TOTAL: 0% user + 0.-1% kernel + 0.-3% iowait
07-26 17:51:04.442: E/ActivityManager(2258): CPU usage from 5196ms to 5716ms later:
07-26 17:51:04.442: E/ActivityManager(2258):   23% 4292/com.android.systemui: 13% user + 9.6% kernel / faults: 305 minor
07-26 17:51:04.442: E/ActivityManager(2258):     23% 4308/Signal Catcher: 13% user + 9.6% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   5.7% 2258/system_server: 1.9% user + 3.8% kernel / faults: 6 minor
07-26 17:51:04.442: E/ActivityManager(2258):     1.9% 2346/ActivityManager: 1.9% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):     1.9% 3244/SensorService: 0% user + 1.9% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   1.4% 35/kworker/u17:0: 0% user + 1.4% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   1.6% 4215/VosMCThread: 0% user + 1.6% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   1.9% 4618/com.baidu.map.location: 1.9% user + 0% kernel / faults: 823 minor
07-26 17:51:04.442: E/ActivityManager(2258):     1.9% 4647/GCDaemon: 1.9% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):    +0% 28983/Thread-37982: 0% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   1.7% 28524/kworker/u16:3: 0% user + 1.7% kernel
07-26 17:51:04.442: E/ActivityManager(2258):   1.7% 28560/com.pafinancialtech.zijinbank: 1.7% user + 0% kernel / faults: 7 minor
07-26 17:51:04.442: E/ActivityManager(2258):   1.8% 28690/com.pafinancialtech.zijinbank:remote: 1.8% user + 0% kernel / faults: 274 minor
07-26 17:51:04.442: E/ActivityManager(2258):     1.8% 28723/ServiceStartArg: 1.8% user + 0% kernel
07-26 17:51:04.442: E/ActivityManager(2258): 0.1% TOTAL: 0% user + 0% kernel + 0.1% iowait

从上面的堆栈信息,我们搜索关键字anr,可以找到

07-26 17:51:04.442: E/ActivityManager(2258): ANR in com.pafinancialtech.zijinbank (com.pafinancialtech.zijinbank/com.pingan.fstandard.paff.activity.MainActivity)

解读:在07-26 17点51分04秒442毫秒的时候 ActivityManager (进程号为2258) 发生了如下错误:com.pafinancialtech.zijinbank 包下面的MainActivity 无响应

第三行

Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 11.  Wait queue head age: 5573.8ms.)

解读:keyDispatchingTimeOut - 按键分配超时

第四行

07-26 17:51:04.442: E/ActivityManager(2258): Load: 5.15 / 3.68 / 3.38

解读:5分钟,10分钟,15分钟内的平均负载分别为:5.15 , 3.68 , 3.38

第五行

07-26 17:51:04.442: E/ActivityManager(2258): CPU usage from 602ms to -6091ms ago:

解读:cpu从602毫秒到6091毫秒之间的使用情况,也就是后面输出的一堆log信息。

至此,我们可以了解大概的原因了,程序肯定有地方做了耗时操作,但是具体在哪部分代码出现的问题呢,继续从堆栈信息查找,然并未找到突破口。既然堆栈信息没有可用信息了,我们再分析虚拟机信息 ,找到从手机pull出来的anr文件, 搜索“Dalvik Thread”关键词,快速定位到本应用程序的虚拟机信息日志,如下:

DALVIK THREADS (91):
"main" prio=5 tid=1 Native
  | group="main" sCount=1 dsCount=0 obj=0x745bd000 self=0xab1951a8
  | sysTid=24311 nice=-6 cgrp=apps sched=0/0 handle=0xf716dec8
  | state=S schedstat=( 4172755721 1784708550 9755 ) utm=340 stm=77 core=0 HZ=100
  | stack=0xff6cf000-0xff6d1000 stackSize=8MB
  | held mutexes=
  kernel: __switch_to+0x70/0x7c
  kernel: binder_thread_read+0x464/0xe8c
  kernel: binder_ioctl+0x3f8/0x824
  kernel: compat_sys_ioctl+0x10c/0x11fc
  kernel: cpu_switch_to+0x48/0x4c
  native: #00 pc 00039cc4  /system/lib/libc.so (__ioctl+8)
  native: #01 pc 0003f4bd  /system/lib/libc.so (ioctl+14)
  native: #02 pc 0001ccd3  /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+190)
  native: #03 pc 0001d347  /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+42)
  native: #04 pc 0001d509  /system/lib/libbinder.so (android::IPCThreadState::transact(int, unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+124)
  native: #05 pc 0001872b  /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+30)
  native: #06 pc 00082b53  /system/lib/libandroid_runtime.so (???)
  native: #07 pc 00a80891  /system/framework/arm/boot.oat (Java_android_os_BinderProxy_transactNative__ILandroid_os_Parcel_2Landroid_os_Parcel_2I+144)
  at android.os.BinderProxy.transactNative(Native method)
  at android.os.BinderProxy.transact(Binder.java:496)
  at android.location.ILocationManager$Stub$Proxy.getFromLocation(ILocationManager.java:777)
  at android.location.Geocoder.getFromLocation(Geocoder.java:133)
  at bonree.com.bonree.agent.android.harvest.f.a(unavailable:-1)
  at bonree.com.bonree.agent.android.harvest.j.onLocationChanged(unavailable:-1)
  at android.location.LocationManager$ListenerTransport._handleMessage(LocationManager.java:281)
  at android.location.LocationManager$ListenerTransport.access$000(LocationManager.java:210)
  at android.location.LocationManager$ListenerTransport$1.handleMessage(LocationManager.java:226)
  at android.os.Handler.dispatchMessage(Handler.java:102)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:5305)
  at java.lang.reflect.Method.invoke!(Native method)
  at java.lang.reflect.Method.invoke(Method.java:372)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:922)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:717)

"Heap thread pool worker thread 0" prio=5 tid=2 Native (still starting up)
  | group="" sCount=1 dsCount=0 obj=0x0 self=0xab1bffd0
  | sysTid=24316 nice=0 cgrp=apps sched=0/0 handle=0xab3f9a28
  | state=S schedstat=( 7943593 7965519 46 ) utm=0 stm=0 core=4 HZ=100
  | stack=0xf4ca7000-0xf4ca9000 stackSize=1020KB
  | held mutexes=
  kernel: __switch_to+0x70/0x7c
  kernel: futex_wait_queue_me+0xd4/0x12c
  kernel: futex_wait+0xe0/0x1d4
  kernel: do_futex+0xc8/0x8d0
  kernel: compat_SyS_futex+0xd0/0x14c
  kernel: cpu_switch_to+0x48/0x4c
  native: #00 pc 0000f618  /system/lib/libc.so (syscall+28)
  native: #01 pc 000a9487  /system/lib/libart.so (art::ConditionVariable::Wait(art::Thread*)+82)
  native: #02 pc 00230897  /system/lib/libart.so (art::ThreadPool::GetTask(art::Thread*)+50)
  native: #03 pc 0023083d  /system/lib/libart.so (art::ThreadPoolWorker::Run()+52)
  native: #04 pc 00231079  /system/lib/libart.so (art::ThreadPoolWorker::Callback(void*)+52)
  native: #05 pc 00012f1b  /system/lib/libc.so (__pthread_start(void*)+30)
  native: #06 pc 00010fdf  /system/lib/libc.so (__start_thread+6)
  (no managed stack frames)

"Heap thread pool worker thread 1" prio=5 tid=3 Native (still starting up)
  | group="" sCount=1 dsCount=0 obj=0x0 self=0xab19bb18
  | sysTid=24317 nice=0 cgrp=apps sched=0/0 handle=0xab19ae08
  | state=S schedstat=( 7940266 9912290 36 ) utm=0 stm=0 core=4 HZ=100
  | stack=0xf4ba5000-0xf4ba7000 stackSize=1020KB
  | held mutexes=
  kernel: __switch_to+0x70/0x7c
  kernel: futex_wait_queue_me+0xd4/0x12c
  kernel: futex_wait+0xe0/0x1d4
  kernel: do_futex+0xc8/0x8d0
  kernel: compat_SyS_futex+0xd0/0x14c
  kernel: cpu_switch_to+0x48/0x4c
  native: #00 pc 0000f618  /system/lib/libc.so (syscall+28)
  native: #01 pc 000a9487  /system/lib/libart.so (art::ConditionVariable::Wait(art::Thread*)+82)
  native: #02 pc 00230897  /system/lib/libart.so (art::ThreadPool::GetTask(art::Thread*)+50)
  native: #03 pc 0023083d  /system/lib/libart.so (art::ThreadPoolWorker::Run()+52)
  native: #04 pc 00231079  /system/lib/libart.so (art::ThreadPoolWorker::Callback(void*)+52)
  native: #05 pc 00012f1b  /system/lib/libc.so (__pthread_start(void*)+30)
  native: #06 pc 00010fdf  /system/lib/libc.so (__start_thread+6)
  (no managed stack frames)

"Heap thread pool worker thread 2" prio=5 tid=4 Native (still starting up)
  | group="" sCount=1 dsCount=0 obj=0x0 self=0xab3530a8
  | sysTid=24318 nice=0 cgrp=apps sched=0/0 handle=0xab3539c0
  | state=S schedstat=( 4911303 7280680 38 ) utm=0 stm=0 core=5 HZ=100
  | stack=0xf4aa3000-0xf4aa5000 stackSize=1020KB
  | held mutexes=
  kernel: __switch_to+0x70/0x7c
  kernel: futex_wait_queue_me+0xd4/0x12c
  kernel: futex_wait+0xe0/0x1d4
  kernel: do_futex+0xc8/0x8d0
  kernel: compat_SyS_futex+0xd0/0x14c
  kernel: cpu_switch_to+0x48/0x4c
  native: #00 pc 0000f618  /system/lib/libc.so (syscall+28)
  native: #01 pc 000a9487  /system/lib/libart.so (art::ConditionVariable::Wait(art::Thread*)+82)
  native: #02 pc 00230897  /system/lib/libart.so (art::ThreadPool::GetTask(art::Thread*)+50)
  native: #03 pc 0023083d  /system/lib/libart.so (art::ThreadPoolWorker::Run()+52)
  native: #04 pc 00231079  /system/lib/libart.so (art::ThreadPoolWorker::Callback(void*)+52)
  native: #05 pc 00012f1b  /system/lib/libc.so (__pthread_start(void*)+30)
  native: #06 pc 00010fdf  /system/lib/libc.so (__start_thread+6)
  (no managed stack frames)

日志较多,截取一部分,因为我们要分析的是main线程,所以丢掉部分线程代码并不影响我们分析。
每一段都是一个线程,当然我们还是看main线程。通过分析发现突破口在这里:

  native: #06 pc 00082b53  /system/lib/libandroid_runtime.so (???)
  native: #07 pc 00a80891  /system/framework/arm/boot.oat (Java_android_os_BinderProxy_transactNative__ILandroid_os_Parcel_2Landroid_os_Parcel_2I+144)
  at android.os.BinderProxy.transactNative(Native method)
  at android.os.BinderProxy.transact(Binder.java:496)
  at android.location.ILocationManager$Stub$Proxy.getFromLocation(ILocationManager.java:777)
  at android.location.Geocoder.getFromLocation(Geocoder.java:133)
  at bonree.com.bonree.agent.android.harvest.f.a(unavailable:-1)
  at bonree.com.bonree.agent.android.harvest.j.onLocationChanged(unavailable:-1)
  at android.location.LocationManager$ListenerTransport._handleMessage(LocationManager.java:281)
  at android.location.LocationManager$ListenerTransport.access$000(LocationManager.java:210)
  at android.location.LocationManager$ListenerTransport$1.handleMessage(LocationManager.java:226)
  at android.os.Handler.dispatchMessage(Handler.java:102)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:5305)
  at java.lang.reflect.Method.invoke!(Native method)
  at java.lang.reflect.Method.invoke(Method.java:372)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:922)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:717)

可以看到程序调用了Geocoder类的getFromLocation方法,难道系统api也有bug?带着疑问,我们看下Geocoder的源码,如下:

ff39318b51641df88fbf90c468222142.png

从官方文档的介绍来看,getFromLocation方法需要通过网络查找来获取结果,最好在主线程之外的线程来调用这个方法。所以,最可能的原因就是我们程序在主线程调用了这个方法,然后移步到程序调用的地方,发现如下代码

 public final void a(Location var1) {
        if(var1 == null) {
            this.p.d("Location must not be null");
        } else {
            Geocoder var2 = new Geocoder(this.q);
            List var3 = null;

            try {
                var3 = var2.getFromLocation(var1.getLatitude(), var1.getLongitude(), 1);
            } catch (IOException var4) {
                this.p.d("Unable to geocode location: " + var4.toString());
            }

            if(var3 != null && var3.size() != 0) {
                Address var5;
                if((var5 = (Address)var3.get(0)) != null) {
                    String var7 = var5.getCountryCode();
                    String var6 = var5.getAdminArea();
                    if(var7 != null && var6 != null) {
                        if(var7 == null || var6 == null) {
                            this.p.d("Country code and administrative region are required.");
                        }

                        this.l();
                    }

                }
            }
        }
    }

原来确实是程序在主线程调用了此方法,导致主线程被阻塞,发生的ANR.

Android