- Run a single or a batch of SPARQL queries
- Run SPARQL queries concurrently by emulating multiple clients
- Load/Store the statistics of SPARQL query optimizer
The command `sparql <args>` runs SPARQL queries.
Note: Wukong requires a query plan for every SPARQL query, provided either by the user or by Wukong's planner.
- Run a SPARQL query with user-defined plan
Use command `sparql -f <fname> -p <pfname>` to run a SPARQL query in the file (`<fname>`) with a user-defined plan file (`<pfname>`).
wukong> sparql -f sparql_query/lubm/basic/lubm_q7 -p sparql_query/lubm/basic/manual_plan/lubm_q7.fmt
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 215 usec
INFO: User-defined query plan is enabled
INFO: The query starts from an index vertex, you could use option -m to accelerate it.
INFO: (last) result size: 73
INFO: (average) latency: 7344 usec
- Run a SPARQL query with the plan generated by Wukong's planner
Use command `sparql -f <fname>` to run a SPARQL query in the file (`<fname>`) with a plan generated by Wukong's planner.
Note: please set `global_enable_planner` to `1` (on) to enable Wukong's planner.
wukong> sparql -f sparql_query/lubm/basic/lubm_q7
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 138 usec
INFO: Optimization time: 2331 usec
INFO: The query starts from an index vertex, you could use option -m to accelerate it.
INFO: (last) result size: 73
INFO: (average) latency: 5128 usec
- Run a batch of SPARQL queries
Use command `sparql -b <fname>` to run a batch of SPARQL queries in the file (`<fname>`).
wukong> sparql -b sparql_query/lubm/batch/batch_q1
INFO: Batch-mode start ...
Run the command: sparql -f sparql_query/lubm/basic/lubm_q1
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 131 usec
...
INFO: Batch-mode end.
- Add `-m <factor>` option to set the multi-threading `<factor>` (the number of threads) for heavy (non-selective) queries.
wukong> sparql -f sparql_query/lubm/basic/lubm_q2 -m 6
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 109 usec
INFO: Optimization time: 20 usec
INFO: (last) result size: 1889
INFO: (average) latency: 2038 usec
- Add `-n <num>` option to run a SPARQL query repeatedly, `<num>` times.
wukong> sparql -f sparql_query/lubm/basic/lubm_q2 -m 2 -n 10
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 28 usec
INFO: Optimization time: 26 usec
INFO: (last) result size: 1889
INFO: (average) latency: 258 usec
- Add `-N <num>` option to have Wukong's planner generate the query plan `<num>` times.
wukong> sparql -f sparql_query/lubm/basic/lubm_q2 -m 2 -N 2
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 91 usec
INFO: Optimization time: 10 usec
INFO: (last) result size: 1889
INFO: (average) latency: 2802 usec
- Add `-v <lines>` option to show the first `<lines>` lines of results.
wukong> sparql -f sparql_query/lubm/basic/lubm_q2 -m 2 -v 2
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 96 usec
INFO: Optimization time: 19 usec
INFO: (last) result size: 1889
INFO: The first 2 rows of results:
1: <http://www.Department6.University1.edu/Course36> "Course36"
2: <http://www.Department6.University1.edu/Course44> "Course44"
INFO: (average) latency: 1812 usec
- Add `-o <fname>` option to store the results in a file (`<fname>`).
wukong> sparql -f sparql_query/lubm/basic/lubm_q2 -m 2 -o result_file
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 99 usec
INFO: Optimization time: 19 usec
INFO: (last) result size: 1889
INFO: (average) latency: 1868 usec
- Add `-g` option to run a (heavy) SPARQL query on the GPU.
wukong> sparql -f sparql_query/lubm/basic/lubm_q1 -g
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 121 usec
INFO: Optimization time: 907 usec
INFO: Leverage GPU to accelerate query processing.
INFO: (last) result size: 106
INFO: (average) latency: 1355 usec
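When comparing runs with different options (for example with and without `-m` or `-g`), it can help to pull the numbers out of the console output. The following is a minimal sketch, assuming the output has been captured as text and that the log lines look exactly like the examples above; the function name `parse_sparql_log` and the field names are hypothetical and not part of Wukong.

```python
import re

# Patterns mirror the log lines shown in the examples above. Other Wukong
# versions may print different text (assumption).
_FIELDS = {
    "parsing_usec": re.compile(r"Parsing time: (\d+) usec"),
    "optimization_usec": re.compile(r"Optimization time: (\d+) usec"),
    "result_size": re.compile(r"\(last\) result size: (\d+)"),
    "latency_usec": re.compile(r"\(average\) latency: (\d+) usec"),
}


def parse_sparql_log(text):
    """Return the numeric fields found in the output of one `sparql` run."""
    stats = {}
    for line in text.splitlines():
        for name, pattern in _FIELDS.items():
            match = pattern.search(line)
            if match:
                stats[name] = int(match.group(1))
    return stats


# Output copied from the `-g` example above.
SAMPLE = """\
INFO: Parsing a SPARQL query is done.
INFO: Parsing time: 121 usec
INFO: Optimization time: 907 usec
INFO: Leverage GPU to accelerate query processing.
INFO: (last) result size: 106
INFO: (average) latency: 1355 usec
"""

if __name__ == "__main__":
    print(parse_sparql_log(SAMPLE))
```

Fields that do not appear in the output (such as the optimization time when a user-defined plan is used) are simply absent from the returned dictionary.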
The command `sparql-emu <args>` emulates multiple clients that send SPARQL queries concurrently (for evaluating throughput).
Use command `sparql-emu -f <fname>` to concurrently run the SPARQL workload defined in a configuration file (`<fname>`) with multiple emulated clients. Note that the number of emulated clients on each server equals the number of proxies defined by `global_num_proxies`.
wukong> sparql-emu -f sparql_query/lubm/emulator/mix_config
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#GraduateCourse> has 43070 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#AssistantProfessor> has 7624 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#University> has 1000 candidates
INFO: Throughput: 64.9971K queries/sec
INFO: [1sec]
INFO: ...
INFO: [9sec]
INFO: Throughput: 69.4413K queries/sec
INFO: Throughput: 69.0693K queries/sec
INFO: Per-query CDF graph
INFO: CDF Res:
INFO: P Q1 Q2 Q3 Q4 Q5 Q6
INFO: 1 64 64 121 72 65 63
...
INFO: 99 422 424 506 438 427 433
INFO: 100 6615 13317 1827 3303 6613 3315
INFO: Throughput: 69.0326K queries/sec
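The emulator prints several throughput samples during a run. A small script can collect them from captured output, for instance to average them. This is a sketch under the assumption that every sample is printed as `Throughput: <number>K queries/sec`, as in the output above; other unit suffixes are not handled, and the helper names are hypothetical.

```python
import re

# Matches lines such as "INFO: Throughput: 64.9971K queries/sec".
# Only the "K" suffix seen in the examples is supported (assumption).
_THROUGHPUT = re.compile(r"Throughput: ([0-9.]+)K queries/sec")


def throughput_samples(text):
    """Return every reported throughput sample in queries per second."""
    return [float(m.group(1)) * 1000.0 for m in _THROUGHPUT.finditer(text)]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Two lines copied from the output above.
SAMPLE = """\
INFO: Throughput: 64.9971K queries/sec
INFO: [1sec]
INFO: Throughput: 69.4413K queries/sec
"""

if __name__ == "__main__":
    samples = throughput_samples(SAMPLE)
    print(len(samples), "samples, mean", mean(samples), "queries/sec")
```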
The configuration file (`<fname>`) for a SPARQL workload has the following format.
$lights $heavies
$template-query $load_ratio
...
$template-query $load_ratio
- `$lights` denotes the number of types of light (selective) queries, and `$heavies` denotes the number of types of heavy (non-selective) queries.
- Next come `$lights+$heavies` lines. Each line names a template file (`$template-query`) and its ratio in the workload (`$load_ratio`).
- Each `template-query` file contains a SPARQL template, which uses `%` to mark a class of subjects or objects. Candidates of this class are chosen at random to fill the template, generating a large number of SPARQL queries. For example, in the following template file, `%ub:AssistantProfessor` stands for all candidates of type `ub:AssistantProfessor` (e.g., ub:AssistantProfessor1, ub:AssistantProfessor2, etc.).
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
SELECT ?X WHERE {
?X ub:publicationAuthor %ub:AssistantProfessor .
?X rdf:type ub:Publication .
}
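To make the workload format and the `%` placeholder concrete, here is a minimal sketch of both ideas outside of Wukong. It is an illustration only: the function names, the template paths, the ratio values and the candidate IRI are hypothetical, listing light templates before heavy ones is an assumption, and Wukong itself performs the substitution internally on its own candidate sets.

```python
import random
import re


def render_emulator_config(lights, heavies):
    """Render a workload file: a "$lights $heavies" header, then one
    "$template-query $load_ratio" line per template.

    `lights` and `heavies` are lists of (template_path, load_ratio) pairs.
    Writing light templates first is an assumption of this sketch.
    """
    lines = [f"{len(lights)} {len(heavies)}"]
    for path, ratio in list(lights) + list(heavies):
        lines.append(f"{path} {ratio}")
    return "\n".join(lines) + "\n"


# A placeholder is "%" followed by a class name, e.g. %ub:AssistantProfessor.
_PLACEHOLDER = re.compile(r"%(\S+)")


def fill_template(template, candidates, rng=random):
    """Replace each %<class> placeholder with a random candidate of that class.

    `candidates` maps a class name to a list of candidate strings.
    """
    def pick(match):
        return rng.choice(candidates[match.group(1)])

    return _PLACEHOLDER.sub(pick, template)


if __name__ == "__main__":
    # Hypothetical template paths and ratios.
    print(render_emulator_config([("path/to/light_q", 60)], [("path/to/heavy_q", 40)]))

    pattern = "?X ub:publicationAuthor %ub:AssistantProfessor ."
    # Hypothetical candidate; real candidates come from the loaded graph.
    pool = {"ub:AssistantProfessor": ["<http://example.org/AssistantProfessor0>"]}
    print(fill_template(pattern, pool))
```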
- Add `-p <fname>` option to adopt user-defined query plans from file `<fname>` for running SPARQL queries.
wukong> sparql-emu -f sparql_query/lubm/emulator/mix_config -p sparql_query/lubm/emulator/plan_config
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#GraduateCourse> has 43070 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#AssistantProfessor> has 7624 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#University> has 1000 candidates
INFO: Throughput: 74.7658K queries/sec
INFO: Throughput: 78.0418K queries/sec
INFO: [1sec]
INFO: ...
INFO: [9sec]
INFO: Throughput: 78.5904K queries/sec
INFO: Throughput: 78.09K queries/sec
INFO: Per-query CDF graph
INFO: CDF Res:
INFO: P Q1 Q2 Q3 Q4 Q5 Q6
...
INFO: 100 13431 13289 13444 13224 13433 1260
INFO: Throughput: 78.4291K queries/sec
- Add `-d <sec>` option to run sparql-emu for `<sec>` seconds (default: 10).
wukong> sparql-emu -f sparql_query/lubm/emulator/mix_config -d 6
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#GraduateCourse> has 43070 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#AssistantProfessor> has 7624 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#University> has 1000 candidates
INFO: Throughput: 64.6384K queries/sec
INFO: [1sec]
INFO: ...
INFO: [5sec]
INFO: Throughput: 68.8448K queries/sec
INFO: Throughput: 68.8662K queries/sec
INFO: Per-query CDF graph
INFO: CDF Res:
INFO: P Q1 Q2 Q3 Q4 Q5 Q6
INFO: 1 64 65 123 73 65 64
INFO: ...
INFO: 100 2550 2555 1963 2560 2568 1288
INFO: Throughput: 68.9155K queries/sec
- Add `-w <sec>` option to warm up for `<sec>` seconds before evaluating Wukong (default: 5).
wukong> sparql-emu -f sparql_query/lubm/emulator/mix_config -w 1
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#GraduateCourse> has 43070 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#AssistantProfessor> has 7624 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#University> has 1000 candidates
INFO: Throughput: 64.9577K queries/sec
INFO: Throughput: 68.8247K queries/sec
INFO: [1sec]
INFO: ...
INFO: [9sec]
INFO: Throughput: 68.9033K queries/sec
INFO: Throughput: 68.9042K queries/sec
INFO: Per-query CDF graph
INFO: CDF Res:
INFO: P Q1 Q2 Q3 Q4 Q5 Q6
INFO: 1 64 64 121 72 65 61
INFO: 5 87 87 138 92 87 86
INFO: 10 102 102 149 107 102 102
INFO: ...
INFO: 100 13373 13369 13374 13347 13327 4944
INFO: Throughput: 69.0021K queries/sec
- Add `-n <num>` option to keep `<num>` queries in flight (being processed) while evaluating Wukong (default: 20).
wukong> sparql-emu -f sparql_query/lubm/emulator/mix_config -n 5
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#GraduateCourse> has 43070 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#AssistantProfessor> has 7624 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#Department> has 799 candidates
INFO: Parsing a SPARQL template is done.
INFO: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#University> has 1000 candidates
INFO: Throughput: 46.5694K queries/sec
INFO: Throughput: 49.713K queries/sec
INFO: [1sec]
INFO: ...
INFO: [9sec]
INFO: Throughput: 49.9632K queries/sec
INFO: Throughput: 49.8509K queries/sec
INFO: Per-query CDF graph
INFO: CDF Res:
INFO: P Q1 Q2 Q3 Q4 Q5 Q6
INFO: 1 46 47 105 58 48 45
INFO: ...
INFO: 100 4388 1641 4403 1205 2161 1629
INFO: Throughput: 49.7075K queries/sec
The command `load-stat -f <fname>` loads the statistics of the SPARQL query optimizer from file `<fname>` for the current graph data.
wukong> load-stat -f path/to/input/id_lubm_40/statfile
INFO: 1 ms for loading statistics at server 0
The command `store-stat -f <fname>` stores the statistics of the SPARQL query optimizer to file `<fname>` for the current graph data.
wukong> store-stat -f newStatfile
INFO: store statistics to file newStatfile is finished.
The command `load` loads data into the graph store.
- Use command `load -d <dname>` to dynamically load new graph data from all files in the directory (`<dname>`).
wukong> load -d path/to/input/id_lubm_2/
INFO: loading ID-mapping file: path/to/input/id_lubm_2/str_normal
INFO: loading ID-mapping file: path/to/input/id_lubm_2/str_index
INFO: 2 data files and 0 attribute files found in directory (path/to/input/id_lubm_2/) at server 0
INFO: load 205474 triples from file path/to/input/id_lubm_2/id_uni0.nt at server 0
INFO: load 267198 triples from file path/to/input/id_lubm_2/id_uni1.nt at server 0
INFO: #0: 72ms for inserting into gstore
INFO: (average) latency: 106384 usec
- Add `-c` option to check and skip duplicated triples in the dataset.
wukong> load -c -d path/to/input/id_lubm_2/
INFO: loading ID-mapping file: path/to/input/id_lubm_2/str_normal
INFO: loading ID-mapping file: path/to/input/id_lubm_2/str_index
INFO: 2 data files and 0 attribute files found in directory (path/to/input/id_lubm_2/) at server 0
INFO: load 205474 triples from file path/to/input/id_lubm_2/id_uni0.nt at server 0
INFO: load 267198 triples from file path/to/input/id_lubm_2/id_uni1.nt at server 0
INFO: #0: 93ms for inserting into gstore
INFO: (average) latency: 160743 usec
Note: please enable the dynamic graph store by building Wukong with `-DUSE_DYNAMIC_GSTORE=ON`. Otherwise you will get an error message like:
wukong> load -d path/to/input/id_lubm_2/
ERROR: Can't load data into static graph store.
ERROR: You can enable it by building Wukong with -DUSE_DYNAMIC_GSTORE=ON.
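After a successful `load`, the per-file lines in the output can be summed to confirm how many triples were read. The sketch below assumes the output was captured as text and that each file is reported as `load <count> triples from file <path>`, as in the examples above; the helper name is hypothetical.

```python
import re

# Matches lines such as
# "INFO: load 205474 triples from file path/to/input/id_lubm_2/id_uni0.nt at server 0".
_LOADED = re.compile(r"load (\d+) triples from file (\S+)")


def triples_per_file(text):
    """Map each reported data file to the number of triples loaded from it."""
    return {m.group(2): int(m.group(1)) for m in _LOADED.finditer(text)}


# Lines copied from the `load -d` example above.
SAMPLE = """\
INFO: load 205474 triples from file path/to/input/id_lubm_2/id_uni0.nt at server 0
INFO: load 267198 triples from file path/to/input/id_lubm_2/id_uni1.nt at server 0
"""

if __name__ == "__main__":
    counts = triples_per_file(SAMPLE)
    print(sum(counts.values()), "triples from", len(counts), "files")
```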
The command `gsck` checks the integrity of the graph store.
- Use command `gsck` to check the integrity of both index and normal vertices in the graph store.
wukong> gsck
INFO: Graph storage intergity check has started on server 0
INFO: Server#0 already check 5%
...
INFO: Server#0 already check 95%
INFO: Server#0 has checked 49 index vertices and 220128 normal vertices.
INFO: (average) latency: 3436217 usec
- Add `-i` option to check the integrity of index vertices only.
wukong> gsck -i
INFO: Graph storage intergity check has started on server 0
INFO: Server#0 already check 5%
...
INFO: Server#0 already check 95%
INFO: Server#0 has checked 49 index vertices and 0 normal vertices.
INFO: (average) latency: 3081938 usec
- Add `-n` option to check the integrity of normal vertices only.
wukong> gsck -n
INFO: Graph storage intergity check has started on server 0
INFO: Server#0 already check 5%
...
INFO: Server#0 already check 95%
INFO: Server#0 has checked 0 index vertices and 220128 normal vertices.
INFO: (average) latency: 2094172 usec
The command `config <args>` configures Wukong at runtime and lists the current configuration of Wukong.
- Show the current configuration of Wukong
Use command `config -v` to list the current setup of Wukong.
wukong> config -v
------ global configurations ------
the number of proxies: 1
the number of engines: 2
global_input_folder: path/to/input/id_lubm_2
global_memstore_size_gb: 2
global_est_load_factor: 55
global_data_port_base: 5500
global_ctrl_port_base: 9576
global_rdma_buf_size_mb: 0
global_rdma_rbf_size_mb: 0
global_use_rdma: 0
global_enable_caching: 0
global_enable_workstealing: 0
global_stealing_pattern: 0
global_rdma_threshold: 300
global_mt_threshold: 2
global_silent: 0
global_enable_planner: 0
global_generate_statistics: 0
global_enable_vattr: 1
global_num_gpus: 0
global_gpu_rdma_buf_size_mb: 0
global_gpu_rbuf_size_mb: 32
global_gpu_kvcache_size_gb: 10
global_gpu_key_blk_size_mb: 16
global_gpu_value_blk_size_mb: 4
global_gpu_enable_pipeline: 1
--
the number of servers: 1
the number of threads: 3
- Change the configuration of Wukong
Use command `config -s <str>` to configure Wukong with the string `<str>`.
Note: You can use `&` to set several configurations in one string (e.g., item1=val1&item2=...).
wukong> config -s global_mt_threshold=1
wukong> config -s global_mt_threshold=2&global_silent=1
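When several items are changed at once, the `item1=val1&item2=...` string can be assembled programmatically. A minimal sketch, assuming that names and values contain neither `&` nor `=` (how Wukong treats such characters is not covered here); `build_config_string` is a hypothetical helper, not a Wukong API.

```python
def build_config_string(items):
    """Join {name: value} pairs into the "name=value&name=value" form
    accepted by `config -s`."""
    parts = []
    for name, value in items.items():
        text = str(value)
        # Reject characters that would be confused with the separators.
        if any(ch in name or ch in text for ch in "&="):
            raise ValueError(f"'&' and '=' are not allowed in {name!r}={text!r}")
        parts.append(f"{name}={text}")
    return "&".join(parts)


if __name__ == "__main__":
    settings = {"global_mt_threshold": 2, "global_silent": 1}
    # Prints the command used in the example above.
    print("config -s " + build_config_string(settings))
```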
- Load the configuration of Wukong
Use command `config -l <fname>` to configure Wukong by loading the configuration file `<fname>`.
wukong> config -l newConfigFile
- Get help message about `config`
Use command `config -h` or `config --help` to print the help message of command `config`.
wukong> config -h
config <args> run commands for configueration:
-v print current config
-l <fname> load config items from <fname>
-s <string> set config items by <str> (e.g., item1=val1&item2=...)
-h [ --help ] help message about config
The command `logger <args>` sets the log level and shows the current log level.
- The log levels in Wukong
Wukong provides 8 log levels (0 to 7); the current log level controls which messages Wukong prints.
| Log-Level | Name | Description |
|---|---|---|
| 0 | LOG_EVERYTHING | Log everything |
| 1 | LOG_DEBUG | Log debug information |
| 2 | LOG_INFO | Log general useful information |
| 3 | LOG_EMPH | Log important information |
| 4 | LOG_WARNING | Log warning conditions |
| 5 | LOG_ERROR | Log recoverable conditions |
| 6 | LOG_FATAL | Log fatal and probably irrecoverable conditions |
| 7 | LOG_NONE | Log nothing |
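The table can be mirrored in a script, for example to translate between the numbers passed to `logger -s` and the names Wukong prints. The sketch below is illustrative only: the class and function names are hypothetical, and the filtering rule (a message is printed when its level is at or above the current level) is an assumption inferred from the table, not taken from Wukong's source.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    """Numeric log levels, named as in the table above (LOG_ prefix dropped)."""
    EVERYTHING = 0
    DEBUG = 1
    INFO = 2
    EMPH = 3
    WARNING = 4
    ERROR = 5
    FATAL = 6
    NONE = 7


def describe(level):
    """Format a level as "<number> (<name>)", e.g. "2 (INFO)"."""
    return f"{int(level)} ({level.name})"


def is_printed(message_level, current_level):
    """Assumed rule: print a message whose level is >= the current level."""
    return message_level >= current_level


if __name__ == "__main__":
    current = LogLevel(2)
    print("loglevel:", describe(current))
    for level in (LogLevel.DEBUG, LogLevel.INFO, LogLevel.ERROR):
        print(level.name, "printed:", is_printed(level, current))
```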
- Show the current log-level
Use command `logger -v` to print the current log level.
wukong> logger -v
loglevel: 2 (INFO)
- Change the log-level
Use command `logger -s <level>` to switch the current log level to `<level>`.
wukong> logger -s 0
set loglevel to 0 (EVERYTHING)
wukong> logger -s 1
set loglevel to 1 (DEBUG)
wukong> logger -s 2
set loglevel to 2 (INFO)
wukong> logger -s 3
set loglevel to 3 (EMPH)
wukong> logger -s 4
set loglevel to 4 (WARNING)
wukong> logger -s 5
set loglevel to 5 (ERROR)
wukong> logger -s 6
set loglevel to 6 (FATAL)
wukong> logger -s 7
set loglevel to 7 (NONE)
- Get help message about `logger`
Use command `logger -h` or `logger --help` to print the help message of command `logger`.
wukong> logger -h
logger <args> run commands for logger:
-v print loglevel
-s <level> set loglevel to <level> (e.g., DEBUG=1, INFO=2,
WARNING=4, ERROR=5)
-h [ --help ] help message about logger
The command `help` lists the usage of commands in Wukong.
wukong> help
These are common Wukong commands: :
help display help infomation:
quit quit from the console:
config <args> run commands for configueration:
-v print current config
-l <fname> load config items from <fname>
-s <string> set config items by <str> (e.g., item1=val1&item2=...)
-h [ --help ] help message about config
logger <args> run commands for logger:
-v print loglevel
-s <level> set loglevel to <level> (e.g., DEBUG=1, INFO=2,
WARNING=4, ERROR=5)
-h [ --help ] help message about logger
sparql <args> run SPARQL queries in single or batch mode:
-f <fname> run a [single] SPARQL query from <fname>
-m <factor> (=1) set multi-threading <factor> for heavy query
processing
-n <num> (=1) repeat query processing <num> times
-p <fname> adopt user-defined query plan from <fname> for running
a single query
-N <num> (=1) do query optimization <num> times
-v <lines> (=0) print at most <lines> of results
-o <fname> output results info <fname>
-g leverage GPU to accelerate heavy query processing
-b <fname> run a [batch] of SPARQL queries configured by <fname>
-h [ --help ] help message about sparql
sparql-emu <args> emulate clients to continuously send SPARQL queries:
-f <fname> run queries generated from temples configured by
<fname>
-p <fname> adopt user-defined query plans from <fname> for
running queries
-d <sec> (=10) eval <sec> seconds (default: 10)
-w <sec> (=5) warmup <sec> seconds (default: 5)
-n <num> (=20) keep <num> queries being processed (default: 20)
-h [ --help ] help message about sparql-emu
load <args> load RDF data into dynamic (in-memmory) graph store:
-d <dname> load data from directory <dname>
-c check and skip duplicate RDF triples
-h [ --help ] help message about load
gsck <args> check the integrity of (in-memmory) graph storage:
-i check from index key/value pair to normal key/value
pair
-n check from normal key/value pair to index key/value
pair
-h [ --help ] help message about gsck
load-stat load statistics of SPARQL query optimizer:
-f <fname> load statistics from <fname> located at data folder
-h [ --help ] help message about load-stat
store-stat store statistics of SPARQL query optimizer:
-f <fname> store statistics to <fname> located at data folder
-h [ --help ] help message about store-stat
The command `quit` terminates Wukong (not just the console).
wukong> quit
...