Distributed storage systems are forced to select a trade-off between the incompatible goals Scalability, Consistency and Availability. No matter which coordination mechanism is used to create consistency, its choice greatly influences the systems performance regarding this trade-off. Furthermore, support for complex queries is highly desirable to support complex analyses without performance nightmares. However, previous proposals for distributed complex query processing are bound to a particular coordination mechanism, and are therefore difficult to adapt to new architectures.
In this thesis, the goal was to investigate distributed complex query processing independent of a particular coordination mechanism and network architecture. To this end, we have developed comprehensive abstractions of the components of distributed storage systems. Also, we show an abstract method for efficient distributed query processing within this abstract environment, MMQP. Here, query processing moves through the distributed system while constantly re-optimizing its remaining task. Through a stochastic analysis, we show how logarithmic performance with regard to the average path length and the transmission costs in the distributed system is possible.
To further enhance our understanding of this process, we have also performed a number of controlled experiments through simulation. A standard benchmark (TPC-H) is used for realistic data and complex queries requesting parts and aggregates of the data. The relationship between various parameters of our abstract models such as network size and the efficiency of distributed complex query processing is tested. We are able to confirm our theoretical predictions on efficiency.
In summary, we show how stochastically efficient distributed query processing is possible using only a very limited of assumptions about the network, and hope that our results allow the implementation of specific implementations for large-scale distributed storage systems.
Falls Ihr Browser eine Datei nicht öffnen kann, die Datei zuerst herunterladen und dann öffnen.