SCAWG: A Toolbox for Generating Synthetic Workload for Spatial Crowdsourcing

Existing studies in mobile crowdsourcing (also known as spatial crowdsourcing), a hot research area in recent years, suffer from a lack of real-world datasets. We have therefore published a synthetic dataset generator that produces common datasets for mobile crowdsourcing.

The toolbox generates synthetic workload patterns based on the spatial (location) and temporal (time) distributions of workers and tasks. As shown in the figure below, it also takes into account various real-world constraints, such as worker region, worker capacity, worker activeness, and temporal workload.
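To make the idea concrete, here is a minimal Python sketch of how such a workload could be generated. It is not SCAWG's actual code or interface: every name in it (`Worker`, `Task`, `sample_location`, `generate_workload`) and every parameter and default value is a hypothetical choice made for illustration. The sketch assumes locations are drawn either uniformly or from Gaussian clusters, that the temporal workload is given as a list of task counts per time instance, and that worker activeness is the probability that a worker is online at a given time instance.

```python
import random
from dataclasses import dataclass


@dataclass
class Worker:
    worker_id: int
    lat: float
    lon: float
    region: tuple   # (min_lat, min_lon, max_lat, max_lon) the worker is willing to serve
    capacity: int   # maximum number of tasks the worker accepts


@dataclass
class Task:
    task_id: int
    lat: float
    lon: float
    arrival: int    # index of the time instance at which the task is released


def sample_location(rng, bounds, centers=None, sigma=0.01):
    """Draw a point uniformly in bounds, or from a Gaussian around a random center."""
    min_lat, min_lon, max_lat, max_lon = bounds
    if not centers:
        return rng.uniform(min_lat, max_lat), rng.uniform(min_lon, max_lon)
    c_lat, c_lon = rng.choice(centers)
    # Clamp the Gaussian sample so that it stays inside the bounding box.
    lat = min(max(rng.gauss(c_lat, sigma), min_lat), max_lat)
    lon = min(max(rng.gauss(c_lon, sigma), min_lon), max_lon)
    return lat, lon


def generate_workload(bounds, tasks_per_instance, n_workers, activeness=0.5,
                      max_capacity=3, region_half_width=0.02, centers=None, seed=0):
    """Return a list of (active_workers, tasks) pairs, one per time instance."""
    rng = random.Random(seed)

    # Fixed pool of workers, each with a square working region and a capacity.
    pool = []
    for worker_id in range(n_workers):
        lat, lon = sample_location(rng, bounds, centers)
        h = region_half_width
        region = (lat - h, lon - h, lat + h, lon + h)
        pool.append(Worker(worker_id, lat, lon, region, rng.randint(1, max_capacity)))

    workload = []
    task_id = 0
    for instance, n_tasks in enumerate(tasks_per_instance):
        # Worker activeness: each worker is online with probability `activeness`.
        active = [w for w in pool if rng.random() < activeness]
        # Temporal workload: the requested number of tasks arrives at this instance.
        tasks = []
        for _ in range(n_tasks):
            lat, lon = sample_location(rng, bounds, centers)
            tasks.append(Task(task_id, lat, lon, instance))
            task_id += 1
        workload.append((active, tasks))
    return workload


if __name__ == "__main__":
    # Hypothetical bounding box and a three-instance temporal workload.
    box = (34.0, -118.5, 34.2, -118.2)
    for active, tasks in generate_workload(box, [5, 10, 2], n_workers=20):
        print(len(active), "active workers,", len(tasks), "tasks")
```

Passing a list of `centers` switches both workers and tasks from a uniform to a clustered spatial distribution, and changing `tasks_per_instance` changes the temporal pattern. Fixing `seed` makes a generated dataset reproducible, which matters when several studies are meant to share it.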


Link to the toolbox


Hien To, Mohammad Asghari, Dingxiong Deng, and Cyrus Shahabi, SCAWG: A Toolbox for Generating Synthetic Workload for Spatial Crowdsourcing, In Proceedings of the International Workshop on Benchmarks for Ubiquitous Crowdsourcing: Metrics, Methodologies, and Datasets (CROWDBENCH 2016), Sydney, Australia, March 14-18, 2016.


Spatial Crowdsourcing and Applications

Spatial crowdsourcing is a new mobile platform that extends crowdsourcing beyond the digital domain and links it to tasks in the physical world. One of my favorite examples for explaining spatial crowdsourcing is a true story from Beijing, China. A young Chinese girl named Ling Yifan started a campaign of love to make her grandpa’s last days delightful (he had been diagnosed with lymph cancer). She created an Internet campaign called “Taking Grandpa Around the World”. As a result, she received 20,000+ replies with photos of her grandfather’s portrait at many beautiful places around the world, such as Switzerland, Italy, Germany, and San Francisco. Her grandfather spent his last days with joy, looking at beautiful pictures. Let us come back to this example later.

So what is spatial crowdsourcing? Let us break the term into three parts: 1) outsourcing, 2) crowd, and 3) spatial.

First of all, outsourcing is the contracting out of an internal business process to a third-party organization (Wikipedia). For example, Apple sends the task of making iPhone cases to China because of cheaper labor, and Japanese software companies outsource parts of their software development to Vietnam for the same reason. More examples can be found in the book “The World Is Flat”.

Second, why crowd? There is a whole research area on crowd-related topics, such as groupthink, crowdfunding, and crowdsourcing. Most of that research is based on the claim that a group of people is more intelligent than any individual because of the diversity of ideas. This is probably one reason why we have group meetings, group discussions, and so on: to collect intelligence from people or, more generally, the crowd. The book “The Wisdom of Crowds” discusses this. As I understand it, crowdsourcing essentially brings the idea of the crowd to the outsourcing business. In 2006, Jeff Howe coined the term crowdsourcing in Wired Magazine. He defined crowdsourcing as the act of a company taking a function performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. There are various examples of crowdsourcing. The biggest crowdsourcing Internet marketplaces use human computation to perform tasks that computers cannot yet solve effectively, such as labeling pictures. Other examples include a company that outsources the design of T-shirts to the crowd (and makes money from it) and a platform whose idea is to outsource the solving of scientific problems to the crowd.

Finally, how does the spatial aspect play a role in crowdsourcing? To keep it simple, spatial means a location with a latitude and a longitude. With spatial crowdsourcing, users need to be physically present at the task location in order to perform the task. That is, users interact with each other not only on the Internet but also in the physical world. A main reason crowdsourcing is a successful business model is the popularity and convenience of the Internet; gathering a workforce from the crowd is easier than ever. With spatial crowdsourcing, however, physically traveling to a location is an impediment to practicality and adoption. People are not likely to drive 30 miles to do a simple task. It just does not work. For spatial crowdsourcing applications to work, the tasks should not take much of the users’ time. One way to achieve this is to let users solve tasks while they are traveling, so the potential users of spatial crowdsourcing are people who travel a lot. One example of spatial crowdsourcing is a free GPS navigation app for iPhone/Android with spatially crowdsourced features. Its users (drivers) can report traffic jams, accidents, or police so that other users heading down the same road are aware of those incidents in advance. Another feature is reporting the prices at gas stations (in the U.S. you can save a few bucks by going to a cheap gas station). Note that users can do these tasks without much effort. Other successful examples include, but are not limited to, Uber (a taxi ride-sharing app), TaskRabbit and GigWalk (crowdsourcing household tasks), and, more recently, Google Shopping Express.

So why has spatial crowdsourcing suddenly become popular?

The reason is that smartphones are now ubiquitous and carry many sensors. Network quality is also improving, for example with 4G LTE. These three factors enable us to develop the spatial crowdsourcing (SC) applications listed in the figure below.


Examples of Spatial Crowdsourcing Applications

Recently, we developed an app named iRain [1] that uses spatial crowdsourcing to let human workers report precipitation conditions, in particular rain level or no-rain observations, to improve real-time global satellite precipitation estimation. Basically, researchers specify a set of locations where they want rain information, and our system crowdsources the tasks (i.e., the set of locations) to nearby users through push notifications. When notified users receive a task, they just need to report an observation, say “heavy rain”. All the user reports are then available to the researchers.
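The step of choosing which nearby users to notify can be sketched as follows. This is only an illustration, not the iRain implementation or the assignment algorithm of [1]: the function names, the dictionary-based inputs, and the 1 km default radius are all assumptions made for this example, and "nearby" is taken to mean great-circle (haversine) distance within a fixed radius.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def users_to_notify(task_locations, user_locations, radius_km=1.0):
    """Map each task id to the ids of the users located within radius_km of it.

    Both arguments are dicts of the form {id: (lat, lon)} (a hypothetical layout).
    """
    return {
        task_id: [
            user_id
            for user_id, (u_lat, u_lon) in user_locations.items()
            if haversine_km(t_lat, t_lon, u_lat, u_lon) <= radius_km
        ]
        for task_id, (t_lat, t_lon) in task_locations.items()
    }


if __name__ == "__main__":
    # Hypothetical coordinates: one requested location and two users.
    tasks = {"t1": (34.0224, -118.2851)}
    users = {"near": (34.0230, -118.2860), "far": (34.1000, -118.0000)}
    print(users_to_notify(tasks, users))
```

A linear scan over all users is fine for a sketch; a deployed system with many users would more likely rely on a spatial index, and might also cap how many users are notified per task.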

[1] Hien To, Liyue Fan, Luan Tran, and Cyrus Shahabi, Real-Time Task Assignment in Hyperlocal Spatial Crowdsourcing under Budget Constraints, In Proceedings of the IEEE International Conference on Pervasive Computing and Communications (PerCom 2016), Sydney, Australia, March 14-18, 2016.