Project author: pigna90

Project description:
Laboratory on Algorithms for Big Data a.a. 2016/17 - University of Pisa
Language: C++
Project URL: git://github.com/pigna90/time_series_indexing.git
Created: 2016-10-03T08:14:10Z
Project community: https://github.com/pigna90/time_series_indexing

License: MIT License


Time Series Indexing - LABD

This project was made for educational purposes during the [Laboratory on Algorithms for Big Data] course (University of Pisa), academic year 2016/17.

Libraries

  • Cereal - Serialization library
  • SDSL - Succinct Data Structures Library
  • Google Test - Google's C++ unit testing framework

Tech

  • [CMake] - Family of tools designed to build, test and package software

Usage

Generate a Makefile and build the project:

  1. $ mkdir _build
  2. $ cd _build
  3. $ cmake ..
  4. $ make

The executables will be placed in _build/bin/.

Indices

There are two implementations, each identified by an index id:

  • 0 : Baseline implementation
  • 1 : Implementation with succinct data structures

Data serialization

  1. $ cd _build/bin/
  2. $ ./build_index id path/to/dataset

The serialized object is saved in the same folder as the input dataset and is named like dataset+index.

Build query sets

  1. $ ./build_query_sets path/to/dataset num_of_query min_date_interval max_k

This builds (min_date_interval/100)*max_k different query sets, each with a different range size and K.
If a serialized data structure with id=0 already exists, the creation is faster because no structure has to be populated from the dataset file.

Run queries

  1. $ ./run_queries id path/to/dataset path/to/query_set

Test

Test the index=0 implementation:

  1. $ ./test_baseline

Test the index=1 implementation:

  1. $ ./test_index1

[Laboratory on Algorithms for Big Data]: http://pages.di.unipi.it/rossano/221-2/laboratory-on-algorithms-for-big-data-a-a-201617

[CMake]: https://cmake.org