test

Computing Technologies and Human Aspects research cluster

TRADE-OFF: A trade-off framework for globally distributed NoSQL data storages

The purpose of the project is to develop a framework allowing developers of large-scale distributed systems to interplay between Energy Consumption, Latency, Availability, Durability and Consistency of distributed data storages like Cassandra NoSQL.

TRADE-OFF: A trade-off framework for globally distributed NoSQL data storages

Cassandra trade-off models and assessment

Facebook operates five data centres including 180000+ servers. Everyday data increase ~ 500 TB; Number of everyday user accesses ~ 1 million.

Very large systems like Facebook will partition at some point due to node failures (e.g. Google reports MTTF=9 hours in its 1800’ node cluster), packets loss, network disconnections, congestions and other accidents. Thus, high availability and performance requirements demand the use of data replication.

However, necessity to run several data replicas increases the overall power, consumed by a system. The more replicas we run, the better availability can be achieved at higher energy expenses. Moreover, the more replicas we invoke simultaneously to increase data consistency, the more energy is spent in addition for parallel request processing and transferring larger amount of data over the network.

Developers of large-scale Internet-systems and distributed data storages need to understand energy overheads and have to care about additional delays and their uncertainty. Similarly, providing consistency among replicas is another major issue in distributed systems.

Some of existing BigData solutions like Cassandra NoSQL, originated by Facebook, provide a possibility to choose a desired consistency level. However, there are no known tools predicting system latency depending on the provided consistency level, number of replicas, and other essential systems parameters.

Project concept and architecture

The purpose of the project is to develop a framework allowing developers of large-scale distributed systems to interplay between Energy Consumption, Latency, Availability, Durability and Consistency of distributed data storages like Cassandra NoSQL.

  1. Cassandra trade-off models and assessment tools. We will provide a set of tools allowing to quantitatively estimate and predict non-functional Cassandra properties (consistency, latency, energy consumption, availability, durability, etc.) by identifying fundamental trade-offs between them. Besides, we will provide a tool support for estimation of optimal settings of Cassandra cluster (replication factor, consistency level, timeout settings, caching strategy, etc.) taking into account desired system properties (e.g. response time, energy consumption, etc.).
  2. enhanced create/read/write/update APIs; we will enhance Cassandra API with adding better user control upon system properties while executing create/read/write/update requests. New strategies for data read/write/update will be introduced to minimise system response time.

Target audiences

  1. Companies operating and developing distributed Internet systems and data storages.
  2. Cassandra community.
  3. Facebook customers.

Sources of revenue

  1. Enhancement of flexibility, availability, performance, energy efficiency and other system properties of Facebook services.
  2. Providing commercial tools for Cassandra DB management and control.

Implementation stage

The research group runs extensive experiments investigating trade-offs between different system properties. Some experimental and theoretical results have been already reported in:

  1. Gorbenko, A., Tarasyuk, O., Romanovsky “Fault Tolerant Internet Computing: Benchmarking and modelling trade-offs between availability, latency and consistency”, Elsevier Journal of Network and Computer Applications. – 2019. – No146. – P. 1–14.
  2. A. Gorbenko, A. Romanovsky, “Timeouting Internet Services,” IEEE Security & Privacy, vol. 11(2), 2013
  3. O. Tarasyuk, A. Gorbenko, A. Romanovsky, “The Impact of Consistency on System Latency in Fault Tolerant Internet Computing,” Distributed Applications and Interoperable Systems, LNCS, vol. 9038, 2015.
  • Study With us

    Study for a research degree at Leeds Beckett and you'll join a thriving academic community in an inspiring and supportive environment. The Graduate School supports our increasingly active postgraduate research community and encourages students to make a difference to the university’s research culture and environment.

    The Graduate School
    Study With us
  • research with us

    Leeds Beckett University can conduct research on your behalf to help you to implement change and realise your business potential. Validating your ideas with academic evidence can be an essential part of winning contracts and fuelling business growth.

    Research for business
    research with us