User data protection regulations in most countries and regions set clear requirements for the collection of personal information. Apps must declare reasonable use scenarios and obtain the user’s consent when collecting relevant data. Many enterprises have invested heavily to ensure that their apps comply with their privacy policies. However, compliance remains a challenging problem when third-party SDKs are embedded.
In particular:
- The app cannot comprehensively monitor the information collection behaviors of embedded SDKs, for example, what data is collected and when it is collected and uploaded.
- App developers often lack the expertise needed to analyze the information collection behaviors of third-party SDKs.
- Third-party SDKs evolve quickly, so it is unrealistic to conduct a manual privacy audit of each version.
To solve the above problems, we develop a static taint analyzer for SDKs, based on Facebook’s open-source tool Mariana Trench and our own tool.
Our analyzer identifies the locations of all calls that access sensitive information as source points and the locations of all network interfaces as sink points. It addresses the challenge of asynchronous invocation, which undermines existing analyzers. With this challenge addressed, our analyzer achieves an accuracy of 95%, a recall rate of 71.26%, and an F-measure of 83.22%. We apply our analyzer to several mainstream apps and find that some of them fail to disclose collected user information because of their embedded SDKs. We open-source our analyzer to benefit app developers.
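To illustrate the source/sink formulation and why asynchronous invocation matters, the following is a minimal sketch of taint reachability over a data-flow graph. It is not the analyzer described here and does not use Mariana Trench's API: the function `find_leaks`, the graph `FLOWS`, and the `Sdk` method names are hypothetical, and the Android method names serve only as illustrative node labels.

```python
from collections import deque


def find_leaks(flows, sources, sinks):
    """Return (source, sink) pairs where data from a source can reach a sink."""
    leaks = []
    for src in sorted(sources):
        seen = {src}
        queue = deque([src])
        while queue:  # breadth-first walk along data-flow edges
            node = queue.popleft()
            for nxt in flows.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        leaks.extend((src, sink) for sink in sorted(sinks) if sink in seen)
    return leaks


# Hypothetical data-flow edges inside an embedded SDK (assumed for illustration).
FLOWS = {
    "TelephonyManager.getDeviceId": ["Sdk.collect"],
    "Sdk.collect": ["Handler.post"],
    # Asynchronous edge: the posted task runs later on another thread.
    "Handler.post": ["Sdk$UploadTask.run"],
    "Sdk$UploadTask.run": ["HttpURLConnection.getOutputStream"],
    "LocationManager.getLastKnownLocation": ["Sdk.cacheLocation"],
}
SOURCES = {"TelephonyManager.getDeviceId", "LocationManager.getLastKnownLocation"}
SINKS = {"HttpURLConnection.getOutputStream"}

# Model an analyzer that cannot follow the asynchronous edge.
FLOWS_WITHOUT_ASYNC = {k: v for k, v in FLOWS.items() if k != "Handler.post"}

leaks = find_leaks(FLOWS, SOURCES, SINKS)
missed = find_leaks(FLOWS_WITHOUT_ASYNC, SOURCES, SINKS)
```

In this toy graph, the device identifier reaches the network sink only through the posted task, so an analysis that drops the asynchronous edge reports no leak at all.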