XLDB attempts to tackle challenges related to extreme-scale data sets. The definition of extreme scale is a moving target; current limits are in the petabyte range. Main activities include (a) identifying trends, commonalities, and roadblocks related to building and managing extreme-scale systems and analyzing extreme-scale data sets, (b) bridging the gap between data-intensive users and solution providers, and (c) facilitating the development and growth of appropriate technologies including (but not limited to) databases.

Close to a thousand people worldwide participate. A subset of the community gathers each year at the XLDB event, typically held at Stanford in September or October.


Extremely large databases, today storing petabytes of information, pose challenges beyond those of smaller installations. To address these challenges, in 2007 the Scalable Data Systems team at SLAC, responsible for designing and building a 100-petabyte data access system for the astronomical survey called LSST, organized an invitation-only workshop with three goals:

  • Identifying trends, commonalities, and major roadblocks related to building extremely large databases.
  • Bridging the gaps between extremely large database users, researchers, and solution providers.
  • Facilitating development and growth of practical technologies for extremely large databases.

Invited participants included leading scientific and industrial users, academic researchers, and solution providers. Since then, the XLDB event has grown to merit larger forums: in 2010 an open conference was introduced, drawing a crowd of 150, and the 2011 event drew 300 attendees. Both events reached full capacity, and registration had to be cut off early.

These events helped start several initiatives carried on by the community, including collecting use cases and defining a science benchmark. The open-source SciDB project was started as a result of the first XLDB workshop. A series of satellite workshops in Europe was also initiated, starting with the workshop in Edinburgh in 2011.

XLDB activities continue to be coordinated by the originators, the SLAC Scalable Data Systems team (Jacek Becla, Kian-Tat Lim, and Daniel Wang).