Applying Unusual Data to Find Housing Vacancy in China

research by Syntia

Data analysts should have the ability to use the data collected by private companies for the public good. High vacancy rates have become a serious concern in many Chinese cities, especially in those experiencing an economic slowdown, which some say started as early as 2008 during the global economic crisis. The phenomenon, often called “ghost cities,” usually describes whole cities of vacant properties, but it can also mean pockets of existing second- and third-tier cities where poor planning has left residents without access to jobs or amenities, resulting in high vacancies.

The Civic Data Design Lab (CDDL) set out to test whether it was possible to measure vacancy by using data downloaded from Chinese APIs. Over the course of three months in the fall of 2016, the team developed code to download data from several of China’s top social media sites: Dianping, Amap, Fang, and Baidu.

To get the data, the team sent the APIs thousands of latitude-and-longitude (“lat/long”) points, spaced roughly fifty meters apart, and rotated multiple API user keys on timed request intervals to avoid reaching call limits. Each request supplied a lat/long point and the type of point of interest (POI) wanted (residential location, restaurant, etc.) and returned a string of data for nearby POIs that included each POI’s type, name, lat/long, and street address; the results were collected in a database. Once the information was successfully downloaded for the test city, Chengdu, they captured data for six more second- and third-tier cities: Shenyang, Changchun, Hangzhou, Tianjin, Wuhan, and Xi’an.
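The collection loop described above can be sketched roughly as follows. This is a minimal illustration, not the CDDL team's code: the names `grid_points`, `collect_pois`, and `fetch` are hypothetical, the conversion from meters to degrees is a simple approximation, and `fetch` stands in for whatever call a given provider's API actually requires.

```python
import itertools
import math
import time
from typing import Callable, Iterable, List, Sequence, Tuple

METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude


def grid_points(south: float, west: float, north: float, east: float,
                spacing_m: float = 50.0) -> List[Tuple[float, float]]:
    """Return (lat, lon) points roughly spacing_m apart covering a bounding box."""
    lat_step = spacing_m / METERS_PER_DEG_LAT
    n_rows = math.floor((north - south) / lat_step) + 1
    points = []
    for i in range(n_rows):
        lat = south + i * lat_step
        # A degree of longitude shrinks with the cosine of the latitude.
        lon_step = spacing_m / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
        n_cols = math.floor((east - west) / lon_step) + 1
        points.extend((lat, west + j * lon_step) for j in range(n_cols))
    return points


def collect_pois(points: Iterable[Tuple[float, float]],
                 api_keys: Sequence[str],
                 fetch: Callable[[float, float, str, str], List[dict]],
                 poi_type: str,
                 pause_s: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Query every grid point, rotating API keys and pausing between requests.

    `fetch(lat, lon, poi_type, key)` is a hypothetical stand-in for a real
    API call; it should return a list of POI records (type, name, lat/long,
    address) found near the given point.
    """
    keys = itertools.cycle(api_keys)  # spread requests evenly across keys
    rows: List[dict] = []
    for lat, lon in points:
        rows.extend(fetch(lat, lon, poi_type, next(keys)))
        sleep(pause_s)  # timed interval to stay under per-key call limits
    return rows
```

Passing `fetch` and `sleep` in as arguments keeps the sketch independent of any one provider and lets the loop be exercised without network access. In practice the returned rows would be written to a database and de-duplicated, since neighboring grid points return overlapping POIs.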

The Data Action method requires data analysts to check the accuracy of their analysis with qualitative data. In the Ghost Cities project, we used drones, surveys, and photographs on visits to Chengdu, Tianjin, and Shenyang. The team also interviewed local people in those cities. These site visits proved that our model identified all types of underutilized land, which we defined as residential locations that were not meeting their development potential, either because the land was unoccupied or because construction was stalled. It was important to shed light on these areas since they clearly needed intervention from local actors, including the government, to make them viable places to live.

A large portion of our results identified residential locations that may have been developed five years earlier. Government officials told us they believed people would eventually move to those sites, but that is unlikely to happen without access to schools and other amenities. In time, they said, the amenities would come. We further identified a number of locations where ground had been broken but construction had stalled; semi-vacant housing often surrounded these undeveloped sites. We also found vacated properties, remnants of the various building styles indicative of China’s communist past; for instance, tower complexes lay vacant, and the amenities that once supported them no longer exist. Finally, our team identified one entire satellite city outside Shenyang that lay largely vacant.

One element of the Data Action principles is that models are not complete until reviewed by the people described in them. In the Ghost Cities project, CDDL researchers shared visualizations of the results with local stakeholders, urban planners, real estate developers, and other researchers, asking for their impressions and feedback. Yixue Jiao, the senior urban planner at the China Academy of Urban Planning and Design, said: “As one might expect, vacancy is controversial; while many planners know [ghost cities] exist, they come from directives from higher levels of government,” making it hard for local planners to push back on their development. Although aware of the situation, he felt there was little he could do to change the government’s course. Other urban planners we interviewed discussed how decisions about where to build seldom make use of big data to analyze appropriate locations; planners rely on “theories” and “book knowledge” of what the market might sustain, rather than what is currently observed on the ground. While the planners recognized the accuracy of our results, they felt they had limited ability to change course, even with the data analytics that were presented: many urban planning decisions in China are made by the central government and filtered down for local organizations to enact. It is often a one-way conversation.

A few real estate developers explained how the vacancies we exposed also illustrate China’s real estate bubble: if it burst, they believe, citizens encouraged to carry multiple mortgages would feel the greatest impact. Chinese investors don’t always understand their mortgage risks, because banks are officially tied to the government, and there are reasons for them to insulate investors from potential risk. In other words, the government will certainly bail out banks for bad loans.

Developers acknowledged that the maps illustrate the mismatch between the supply and demand in the Chinese housing market, but explained that the economics of the housing market in China works differently than it does in other places in the world. For the Chinese government, the economic benefits, such as job creation associated with building the development, outweigh the five- to ten-year lag in people purchasing the homes. Researchers believed that addressing oversupply was essential to ensuring a healthy economy and healthy cities: having many cities remain partially vacant increases the difficulty of attracting residents without major government interventions.

It is important to note that the Chinese government could certainly estimate the extent of vacancy from data only it has access to (such as electricity records), but it does not share this information with local-level officials who could use the data to make decisions about local development. The government controls the release of data because it wants to steer the course of development. Even though there has been a move toward a more market-based real estate development economy in China, from the government’s perspective, sharing the data with real estate developers could have unknown outcomes and, ultimately, could cost it control over housing development, a risk it is unwilling to take. Real estate developers therefore have to rely on the government to tell them where to build, and the risk for developers whose projects don’t work well is minimal because the government has an incentive to bail them out: China’s banking industry is closely intertwined with its government, so developer defaults on loans would cause instability in China’s economy.

Sarah Williams, Data Action