BLOG 16 September 2021

Analyzing Social Media to Spot Digital Consumer Credit Risks in India

Eric Duflos, Daryl Collins, Jayshree Venkatesan, Juan Carlos Izaguirre

It was May 2020 when Newsclick ran an article with the headline: “Suicide Deaths Mount after Unregulated Lending Apps Resort to Exploitative Recovery Practices.” This article was among the first in an array of news stories that prompted the Reserve Bank of India and others to take a closer look at India’s growing number of digital consumer credit apps. According to new CGAP research, analyzing social media could have detected the warning signs of aggressive debt collection even before the articles came out.

Mobile phones glow in the night along a riverbank in India. — Photo: Supratim Bhattachargee, 2017 CGAP Photo Contest

As we described in our last blog post, "Digital Consumer Credit in India: Time to Take a Closer Look," India is seeing a digital credit boom. There are close to 200 digital consumer credit apps on the market today, even after Google removed 30 apps from its Play store. These apps are making it easier to borrow money in India than it is to report abusive lending practices through the country’s official channels. As a result, many customers take to social media to share their experiences, warn others and voice their complaints. However, supervisors are not currently monitoring social media in any systematic way to look for early warning signs of consumer protection risks.

To test the potential of monitoring social media, CGAP partnered with Daryl Collins and her team to analyze over 150,000 digital credit-related posts: Google Play reviews from May 2016 to May 2021 and Twitter posts from January 2020 to April 2021. We wanted to explore whether this kind of supervisory technology (suptech) tool for market monitoring could help supervisors and others to listen to consumer voices and get a sense of their experience using digital credit apps, beyond what can be gleaned in the news. We relied on natural language processing (NLP), or the ability of computers to process and understand text written by humans, to analyze the data. We chose to analyze activity on Twitter because it is accessible, widely used and publicly available. We also analyzed Google Play reviews because we could link the reviews to downloads of specific digital credit apps, which helped us to understand whether complaints indicated a market-wide problem or issues with specific apps. Our detailed findings are available here.

NLP allowed us not only to rapidly analyze data to identify posts that fit a pre-defined list of consumer risks, but to flag as urgent a subset of those posts that used highly charged language (like references to suicide, pleading or cursing). Out of the 150,000 posts we analyzed, over 25,000 were complaints. Notably, 45% of Twitter complaints and 15% of Google Play complaints pertained to aggressive collection practices. At 23% of Twitter complaints and 28% of Google Play complaints, claims of “fake apps” were also relatively common. As the chart below shows, many of the complaints were tagged as “urgent.”

Social media complaints by issue
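To make the tagging step more concrete, here is a minimal sketch in Python of how a rule-based pass might match posts against a pre-defined list of risks and then flag an urgent subset. The keyword patterns, category names and sample posts are all hypothetical; the rules used in the pilot are not given in this post, and a real tool would need several rounds of refinement.

```python
import re

# Hypothetical keyword rules; the pilot's actual rule set is not given in this post.
RISK_RULES = {
    "aggressive_collection": [r"\bharass", r"\bthreat", r"\babus", r"calling my contacts"],
    "fake_app": [r"\bfake\b", r"\bfraud", r"\bscam"],
}
# Hypothetical markers of highly charged language.
URGENT_PATTERNS = [r"\bsuicid", r"\bplease help\b", r"\bbegging\b"]


def tag_post(text):
    """Return (matched risk categories, urgent flag) for a single post."""
    lowered = text.lower()
    risks = [
        risk
        for risk, patterns in RISK_RULES.items()
        if any(re.search(p, lowered) for p in patterns)
    ]
    # Only posts that already match a risk category can be flagged as urgent.
    urgent = bool(risks) and any(re.search(p, lowered) for p in URGENT_PATTERNS)
    return risks, urgent


# Made-up posts for illustration only.
posts = [
    "Recovery agents keep harassing me, please help",
    "This is a fake app, total scam",
    "Great app, quick loan approval",
]
results = [tag_post(p) for p in posts]
```

Keeping the urgency check separate from the risk categories mirrors the two-step description above: a post is first identified as a complaint, and only then assessed for charged language.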

Overall, the data shows that online complaints about aggressive debt collection started climbing in late 2019, well before the media coverage began, offering an early warning sign to financial supervisors that this issue might warrant a closer look. It should be noted that compared to the estimated 88 million users of these apps, the total number of complaints related to this and other pre-defined risks was low; we estimate a complaint-to-user ratio of 0.25%. However, this was just a pilot, and we only went through one round of coding and refining the rules to detect consumer complaints. A true market monitoring exercise with this tool would require several rounds on a larger sample of text, and may find a higher percentage of complaints. In addition, we assume that only a small percentage of dissatisfied users write a post online. The spike in aggressive lending complaints was a warning sign of what the news media would report the following year.

Google Play reviews related to aggressive lending practices by digital credit apps

Column chart of monthly complaint data shows that social media complaints preceded news coverage on aggressive lending practices by digital credit apps.
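The monthly view in the chart suggests one simple way a monitoring tool could raise an early warning: count tagged complaints per month and flag any month that rises well above the recent baseline. The sketch below assumes a three-month baseline and a doubling threshold, both chosen arbitrarily for illustration; the dates are invented and are not the pilot's data.

```python
from collections import Counter
from datetime import date


def monthly_counts(dates):
    """Count complaints per (year, month)."""
    return Counter((d.year, d.month) for d in dates)


def flag_spikes(counts, window=3, factor=2.0):
    """Flag months whose count exceeds `factor` times the mean of the
    previous `window` months that appear in the data.

    Simplification: months with no complaints at all are absent from
    `counts` and are therefore skipped rather than treated as zero.
    """
    months = sorted(counts)
    flagged = []
    for i in range(window, len(months)):
        baseline = sum(counts[m] for m in months[i - window:i]) / window
        if counts[months[i]] > factor * baseline:
            flagged.append(months[i])
    return flagged


# Invented complaint dates, for illustration only.
sample_dates = (
    [date(2019, 7, 5)] * 2
    + [date(2019, 8, 9)] * 2
    + [date(2019, 9, 14)] * 2
    + [date(2019, 10, 3)] * 3
    + [date(2019, 11, 21)] * 7
)
spikes = flag_spikes(monthly_counts(sample_dates))
```

With this invented sample, only November 2019 is flagged, because its count is more than double the average of the three preceding months. A supervisor would likely want to tune the window and threshold to the volume of posts in their own market.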

The results do not clearly tell us whether the complaints about aggressive collection and other issues indicate market-wide consumer protection concerns or more isolated problems with a few “bad apps.” But the tool does tell us that a quarter of all the complaints show urgency, and it shows us which apps are the most criticized in the market. We believe that a few bad apples could easily spoil the entire basket and have a long-term impact on consumers’ trust in digital credit.

There are some limitations to NLP social media monitoring. It excludes complaints by financial services users who are not digitally literate enough to post on social media. The number of complaints cannot be equated with the number of people who’ve experienced problems, since the same person can post multiple times. And without additional information about users, supervisors would not be able to use it to zero in on the most vulnerable segments, learn specifically about their experiences, and take action to better protect them.

But as the pilot results demonstrate, NLP social media monitoring can be a useful complement to more traditional tools that financial supervisors use to monitor digital consumer risks and customer outcomes, such as surveys, mystery shopping and analysis of regulatory reporting. It can be particularly useful when it comes to monitoring unregulated or semi-regulated providers that fly below the radar of supervisors. It can be more agile and less expensive than other tools. And NLP allows for easy identification of trends over time, unlike similarly inexpensive one-off phone surveys.

Few financial supervisors have the skills to build and apply an NLP social media monitoring tool in-house. For most supervisors, it will be important to bring in a specialized technical service provider and to be a good partner in the development and application of the tool. Supervisors should have a minimum understanding of the market and milestone events that can influence social media activity, a preliminary idea of the scope and depth of information they want to obtain through NLP, and terms of reference that clearly define requirements, expectations and deliverables. Although it is preferable that the vendor be knowledgeable on financial sector issues, this may not be necessary if the supervisor is prepared to be more hands-on in the development of the tool.

While it has several limitations, NLP social media analysis could be an agile, inexpensive way to help supervisors monitor the market and listen to the collective voice of consumers.

Topic: Enabling and Responsible Financial Policy

Sub-topics: Consumer Protection

Regions: South Asia

Countries: India