Use download thread to speed up result retrieval #280

Open
wants to merge 1 commit into
base: master

Conversation

mdesmet
Contributor

@mdesmet mdesmet commented Nov 3, 2022

Description

Use a separate thread for downloading and do parsing in the main thread.
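
A minimal sketch of the idea described above, not the actual code in this PR: a background thread fetches raw result pages and hands them to the main thread through a `queue.Queue`, and the main thread does the parsing. `fetch_page`, `parse_rows`, `download`, `fetch_all` and the `_DONE` sentinel are hypothetical names used only for illustration.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the result stream


def fetch_page(page_no):
    # Hypothetical stand-in for the HTTP request that fetches one result page.
    return [f"p{page_no}r{i}" for i in range(3)]


def parse_rows(raw_page):
    # Hypothetical stand-in for the row/type mapping done on the main thread.
    return [value.upper() for value in raw_page]


def download(pages, out):
    # Runs in the download thread: fetch each page and hand it over.
    try:
        for page_no in range(pages):
            out.put(fetch_page(page_no))
    finally:
        out.put(_DONE)  # always unblock the consumer, even on error


def fetch_all(pages):
    q = queue.Queue()
    worker = threading.Thread(target=download, args=(pages, q), daemon=True)
    worker.start()
    rows = []
    while True:
        raw = q.get()  # blocks until the download thread delivers a page
        if raw is _DONE:
            break
        rows.extend(parse_rows(raw))  # parsing overlaps with the next download
    worker.join()
    return rows
```

Calling `fetch_all(2)` returns the parsed rows of both pages in order. The sentinel is placed in a `finally` block so the main thread does not wait forever if a download fails.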

Non-technical explanation

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

* Improved result retrieval performance by using a download thread to allow for simultaneous mapping and downloading of results

@cla-bot cla-bot bot added the cla-signed label Nov 3, 2022
@mdesmet mdesmet force-pushed the feature/multithread-fetch branch 2 times, most recently from 3a5e182 to 9b3fb71 Compare January 7, 2023 11:39
@mdesmet mdesmet force-pushed the feature/multithread-fetch branch from 9b3fb71 to 15de687 Compare January 7, 2023 11:41
@mdesmet
Contributor Author

mdesmet commented Jan 7, 2023

Performance tests:

TPC-current prod

1-1-False (iterations-threads-experimental_python_types)

-------------------------------------------------------------------------------------------------------------------------- benchmark: 42 tests --------------------------------------------------------------------------------------------------------------------------
Name (time in ms)                                                                         Min                    Max                   Mean                StdDev                 Median                   IQR            Outliers      OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test[SELECT col_bool FROM memory.default.orders LIMIT 100000   -1-1-False]            59.4271 (1.0)          81.9554 (1.16)         66.5156 (1.02)         6.8402 (5.33)         64.4523 (1.0)          9.5899 (5.96)          3;0  15.0341 (0.98)         15           1
test[SELECT col_inf FROM memory.default.orders LIMIT 100000    -1-1-False]            59.6405 (1.00)         70.5245 (1.0)          65.1079 (1.0)          2.6803 (2.09)         64.7778 (1.01)         2.8484 (1.77)          5;0  15.3591 (1.0)          15           1
test[SELECT col_decimal FROM memory.default.orders LIMIT 100000-1-1-False]            76.8291 (1.29)         88.3533 (1.25)         82.0259 (1.26)         3.0315 (2.36)         82.3237 (1.28)         2.9677 (1.84)          2;1  12.1913 (0.79)         11           1
test[SELECT col_varchar FROM memory.default.orders LIMIT 100000-1-1-False]            81.6312 (1.37)         96.8759 (1.37)         88.2434 (1.36)         4.7039 (3.66)         87.9213 (1.36)         8.1224 (5.05)          3;0  11.3323 (0.74)         12           1
test[SELECT col_bool FROM memory.default.orders LIMIT 100000   -1-1-True]             98.4950 (1.66)        104.2865 (1.48)        100.9127 (1.55)         1.9941 (1.55)        100.5311 (1.56)         3.5320 (2.20)          4;0   9.9096 (0.65)         10           1
test[SELECT col_inf FROM memory.default.orders LIMIT 100000    -1-1-True]            102.9357 (1.73)        106.9619 (1.52)        105.7628 (1.62)         1.2842 (1.0)         106.0175 (1.64)         1.6086 (1.0)           1;0   9.4551 (0.62)         10           1
test[SELECT col_int FROM memory.default.orders LIMIT 100000    -1-1-False]           107.1425 (1.80)        112.7703 (1.60)        110.0202 (1.69)         1.7583 (1.37)        109.7011 (1.70)         1.7709 (1.10)          3;0   9.0892 (0.59)          9           1
test[SELECT col_real FROM memory.default.orders LIMIT 100000   -1-1-False]           107.5565 (1.81)        120.4992 (1.71)        114.7358 (1.76)         4.9118 (3.82)        114.1210 (1.77)         8.7792 (5.46)          3;0   8.7157 (0.57)          9           1
test[SELECT col_varchar FROM memory.default.orders LIMIT 100000-1-1-True]            124.6601 (2.10)        155.3064 (2.20)        131.1051 (2.01)        10.1347 (7.89)        126.9830 (1.97)         5.8412 (3.63)          1;1   7.6275 (0.50)          8           1
test[SELECT col_decimal FROM memory.default.orders LIMIT 100000-1-1-True]            129.3154 (2.18)        136.6356 (1.94)        132.8952 (2.04)         2.7708 (2.16)        133.9422 (2.08)         4.7896 (2.98)          3;0   7.5247 (0.49)          8           1
test[SELECT col_int FROM memory.default.orders LIMIT 100000    -1-1-True]            146.7090 (2.47)        157.1980 (2.23)        149.4220 (2.29)         3.5802 (2.79)        148.4073 (2.30)         2.1559 (1.34)          1;1   6.6925 (0.44)          7           1
test[SELECT col_real FROM memory.default.orders LIMIT 100000   -1-1-True]            153.0150 (2.57)        166.0615 (2.35)        157.8911 (2.43)         4.1976 (3.27)        156.5963 (2.43)         3.3517 (2.08)          2;1   6.3335 (0.41)          7           1
test[SELECT col_double FROM memory.default.orders LIMIT 100000 -1-1-False]           175.3770 (2.95)        211.1018 (2.99)        184.5517 (2.83)        13.3958 (10.43)       179.8046 (2.79)         6.7865 (4.22)          1;1   5.4185 (0.35)          6           1
test[SELECT col_date FROM memory.default.orders LIMIT 100000   -1-1-False]           175.7025 (2.96)        210.0197 (2.98)        189.5205 (2.91)        15.1884 (11.83)       185.0171 (2.87)        26.6087 (16.54)         1;0   5.2765 (0.34)          5           1
test[SELECT col_date FROM memory.default.orders LIMIT 100000   -1-1-True]            210.9910 (3.55)        238.3288 (3.38)        217.8451 (3.35)        11.5684 (9.01)        214.0298 (3.32)         9.4404 (5.87)          1;1   4.5904 (0.30)          5           1
test[SELECT col_double FROM memory.default.orders LIMIT 100000 -1-1-True]            213.9959 (3.60)        223.6658 (3.17)        218.1640 (3.35)         3.6315 (2.83)        217.2803 (3.37)         4.5981 (2.86)          2;0   4.5837 (0.30)          5           1
test[SELECT col_row FROM memory.default.orders LIMIT 100000    -1-1-False]           214.7377 (3.61)        266.7350 (3.78)        236.4905 (3.63)        18.8821 (14.70)       233.0995 (3.62)        16.9282 (10.52)         2;0   4.2285 (0.28)          5           1
test[SELECT col_time FROM memory.default.orders LIMIT 100000   -1-1-False]           304.7335 (5.13)        316.3699 (4.49)        311.6597 (4.79)         5.3691 (4.18)        314.0307 (4.87)         9.5768 (5.95)          1;0   3.2086 (0.21)          5           1
test[SELECT col_array FROM memory.default.orders LIMIT 100000  -1-1-False]           309.7056 (5.21)        364.1924 (5.16)        347.2819 (5.33)        21.8120 (16.98)       355.0323 (5.51)        21.9491 (13.64)         1;0   2.8795 (0.19)          5           1
test[SELECT col_ts_tz FROM memory.default.orders LIMIT 100000  -1-1-False]           311.7995 (5.25)        363.4070 (5.15)        328.3063 (5.04)        20.4396 (15.92)       321.3865 (4.99)        20.4429 (12.71)         1;0   3.0459 (0.20)          5           1
test[SELECT col_row FROM memory.default.orders LIMIT 100000    -1-1-True]            314.9400 (5.30)        324.4890 (4.60)        319.6844 (4.91)         3.8097 (2.97)        318.7735 (4.95)         5.9481 (3.70)          2;0   3.1281 (0.20)          5           1
test[SELECT col_ts FROM memory.default.orders LIMIT 100000     -1-1-False]           320.5561 (5.39)        343.1700 (4.87)        330.0500 (5.07)         8.1791 (6.37)        328.4937 (5.10)         6.5783 (4.09)          2;1   3.0298 (0.20)          5           1
test[SELECT col_time_tz FROM memory.default.orders LIMIT 100000-1-1-False]           366.8443 (6.17)        391.1040 (5.55)        379.9227 (5.84)        12.0651 (9.39)        385.5613 (5.98)        22.7483 (14.14)         2;0   2.6321 (0.17)          5           1
test[SELECT col_array FROM memory.default.orders LIMIT 100000  -1-1-True]            401.9652 (6.76)        419.3676 (5.95)        409.3242 (6.29)         8.9086 (6.94)        404.6502 (6.28)        16.8377 (10.47)         2;0   2.4431 (0.16)          5           1
test[SELECT col_ts FROM memory.default.orders LIMIT 100000     -1-1-True]            504.7970 (8.49)        573.4160 (8.13)        530.4291 (8.15)        26.9747 (21.00)       528.3956 (8.20)        35.0343 (21.78)         1;0   1.8853 (0.12)          5           1
test[SELECT col_time FROM memory.default.orders LIMIT 100000   -1-1-True]            571.7212 (9.62)        609.1631 (8.64)        598.2309 (9.19)        15.1205 (11.77)       604.5553 (9.38)        12.6175 (7.84)          1;1   1.6716 (0.11)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-1-False]           670.8002 (11.29)       712.0598 (10.10)       694.0826 (10.66)       15.6382 (12.18)       698.6392 (10.84)       20.4841 (12.73)         2;0   1.4408 (0.09)          5           1
test[SELECT col_ts_tz FROM memory.default.orders LIMIT 100000  -1-1-True]            685.7413 (11.54)       716.4478 (10.16)       699.4769 (10.74)       12.0841 (9.41)        697.2590 (10.82)       18.3879 (11.43)         2;0   1.4296 (0.09)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-1-True]            757.3809 (12.74)       791.0125 (11.22)       768.5786 (11.80)       13.0885 (10.19)       766.1416 (11.89)       12.0679 (7.50)          1;1   1.3011 (0.08)          5           1
test[SELECT col_map FROM memory.default.orders LIMIT 100000    -1-1-False]           766.4541 (12.90)       840.0774 (11.91)       809.7443 (12.44)       28.2845 (22.02)       817.3744 (12.68)       37.7758 (23.48)         2;0   1.2350 (0.08)          5           1
test[SELECT col_time_tz FROM memory.default.orders LIMIT 100000-1-1-True]            819.5480 (13.79)       854.9526 (12.12)       835.4975 (12.83)       12.8810 (10.03)       834.2350 (12.94)       14.3303 (8.91)          2;0   1.1969 (0.08)          5           1
test[SELECT col_map FROM memory.default.orders LIMIT 100000    -1-1-True]            835.9971 (14.07)       883.0877 (12.52)       858.5483 (13.19)       19.8094 (15.43)       861.5505 (13.37)       33.9565 (21.11)         2;0   1.1648 (0.08)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-10-False]        3,047.4722 (51.28)     3,557.9346 (50.45)     3,325.5598 (51.08)      186.5241 (145.24)    3,353.9351 (52.04)      215.5850 (134.02)        2;0   0.3007 (0.02)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-10-True]         4,420.1128 (74.38)     4,949.5447 (70.18)     4,671.1189 (71.74)      230.0654 (179.15)    4,725.5806 (73.32)      402.5186 (250.23)        2;0   0.2141 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -10-1-False]        6,838.3011 (115.07)    7,285.2625 (103.30)    7,018.6801 (107.80)     168.1544 (130.94)    6,993.0765 (108.50)     200.5396 (124.67)        2;0   0.1425 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 1000000             -1-1-False]         6,881.1385 (115.79)    7,195.6851 (102.03)    7,051.7454 (108.31)     134.8355 (104.99)    7,101.3164 (110.18)     229.3011 (142.55)        2;0   0.1418 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 1000000             -1-1-True]          7,504.4615 (126.28)    7,935.6867 (112.52)    7,667.6439 (117.77)     178.4886 (138.99)    7,588.5837 (117.74)     266.4591 (165.65)        1;0   0.1304 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -10-1-True]         7,717.4880 (129.86)    7,807.0009 (110.70)    7,774.7322 (119.41)      37.6305 (29.30)     7,795.6273 (120.95)      54.2180 (33.71)         1;0   0.1286 (0.01)          5           1
test[SELECT 1                                                  -1000-1-False]     21,142.2548 (355.77)   24,944.4932 (353.70)   23,327.6313 (358.29)   1,378.8910 (>1000.0)  23,465.9452 (364.08)   1,291.9243 (803.14)        2;0   0.0429 (0.00)          5           1
test[SELECT 1                                                  -1000-1-True]      22,148.3341 (372.70)   22,593.8631 (320.37)   22,384.9494 (343.81)     196.1875 (152.77)   22,313.2727 (346.20)     334.5015 (207.95)        3;0   0.0447 (0.00)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 10000000            -1-1-False]        64,358.9783 (>1000.0)  65,323.2422 (926.25)   64,911.5142 (996.98)     352.0023 (274.10)   65,001.6393 (>1000.0)    354.3119 (220.26)        2;0   0.0154 (0.00)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 10000000            -1-1-True]         73,814.5492 (>1000.0)  74,326.2205 (>1000.0)  73,991.3885 (>1000.0)    218.5020 (170.14)   73,859.9247 (>1000.0)    311.4866 (193.64)        1;0   0.0135 (0.00)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TPC - Download thread

-------------------------------------------------------------------------------------------------------------------------- benchmark: 42 tests --------------------------------------------------------------------------------------------------------------------------
Name (time in ms)                                                                         Min                    Max                   Mean                StdDev                 Median                   IQR            Outliers      OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test[SELECT col_inf FROM memory.default.orders LIMIT 100000    -1-1-False]            56.3475 (1.0)          72.1225 (1.0)          60.7453 (1.0)          5.0582 (2.00)         58.5958 (1.0)          7.1377 (3.31)          3;0  16.4622 (1.0)          15           1
test[SELECT col_bool FROM memory.default.orders LIMIT 100000   -1-1-False]            56.9799 (1.01)         72.9136 (1.01)         61.3053 (1.01)         4.0499 (1.60)         60.5551 (1.03)         4.2449 (1.97)          4;1  16.3118 (0.99)         16           1
test[SELECT col_decimal FROM memory.default.orders LIMIT 100000-1-1-False]            73.6358 (1.31)         87.2896 (1.21)         78.8749 (1.30)         4.4721 (1.77)         78.3329 (1.34)         5.2479 (2.43)          3;0  12.6783 (0.77)         11           1
test[SELECT col_varchar FROM memory.default.orders LIMIT 100000-1-1-False]            81.2112 (1.44)         90.2028 (1.25)         85.5449 (1.41)         2.5288 (1.0)          85.5285 (1.46)         2.6437 (1.23)          4;0  11.6898 (0.71)         12           1
test[SELECT col_bool FROM memory.default.orders LIMIT 100000   -1-1-True]             97.5207 (1.73)        125.6030 (1.74)        103.2522 (1.70)         8.2410 (3.26)        100.6374 (1.72)         4.4740 (2.07)          1;1   9.6850 (0.59)         10           1
test[SELECT col_inf FROM memory.default.orders LIMIT 100000    -1-1-True]            100.8161 (1.79)        124.3672 (1.72)        107.3958 (1.77)         7.2747 (2.88)        103.9435 (1.77)         7.5784 (3.51)          1;1   9.3113 (0.57)         10           1
test[SELECT col_int FROM memory.default.orders LIMIT 100000    -1-1-False]           105.1758 (1.87)        119.4839 (1.66)        110.7264 (1.82)         4.4726 (1.77)        110.7116 (1.89)         6.3437 (2.94)          4;0   9.0313 (0.55)         10           1
test[SELECT col_real FROM memory.default.orders LIMIT 100000   -1-1-False]           106.7277 (1.89)        154.9778 (2.15)        114.8974 (1.89)        15.2694 (6.04)        109.0143 (1.86)         5.2195 (2.42)          1;1   8.7034 (0.53)          9           1
test[SELECT col_decimal FROM memory.default.orders LIMIT 100000-1-1-True]            108.8172 (1.93)        116.4240 (1.61)        113.6501 (1.87)         2.6885 (1.06)        114.2714 (1.95)         4.2404 (1.97)          2;0   8.7989 (0.53)          8           1
test[SELECT col_varchar FROM memory.default.orders LIMIT 100000-1-1-True]            110.7422 (1.97)        118.9090 (1.65)        116.2907 (1.91)         2.6913 (1.06)        117.1339 (2.00)         2.4091 (1.12)          2;1   8.5991 (0.52)          9           1
test[SELECT col_int FROM memory.default.orders LIMIT 100000    -1-1-True]            144.2331 (2.56)        153.0033 (2.12)        148.7910 (2.45)         2.8416 (1.12)        149.0766 (2.54)         3.4607 (1.60)          2;0   6.7208 (0.41)          7           1
test[SELECT col_real FROM memory.default.orders LIMIT 100000   -1-1-True]            156.0766 (2.77)        173.6147 (2.41)        160.3993 (2.64)         5.9819 (2.37)        158.4304 (2.70)         2.4798 (1.15)          1;1   6.2344 (0.38)          7           1
test[SELECT col_date FROM memory.default.orders LIMIT 100000   -1-1-False]           162.4096 (2.88)        211.6040 (2.93)        184.1605 (3.03)        19.0297 (7.53)        179.0322 (3.06)        27.1409 (12.58)         2;0   5.4300 (0.33)          5           1
test[SELECT col_double FROM memory.default.orders LIMIT 100000 -1-1-False]           172.8642 (3.07)        187.6813 (2.60)        176.3105 (2.90)         5.6506 (2.23)        174.5597 (2.98)         2.1578 (1.0)           1;1   5.6718 (0.34)          6           1
test[SELECT col_row FROM memory.default.orders LIMIT 100000    -1-1-False]           209.3211 (3.71)        225.5880 (3.13)        215.3792 (3.55)         6.9657 (2.75)        212.2280 (3.62)        11.0002 (5.10)          1;0   4.6430 (0.28)          5           1
test[SELECT col_double FROM memory.default.orders LIMIT 100000 -1-1-True]            212.0628 (3.76)        222.0341 (3.08)        215.8818 (3.55)         4.6792 (1.85)        213.3565 (3.64)         8.1800 (3.79)          1;0   4.6322 (0.28)          5           1
test[SELECT col_date FROM memory.default.orders LIMIT 100000   -1-1-True]            212.4795 (3.77)        256.6109 (3.56)        225.7077 (3.72)        17.8163 (7.05)        221.1361 (3.77)        17.1199 (7.93)          1;0   4.4305 (0.27)          5           1
test[SELECT col_row FROM memory.default.orders LIMIT 100000    -1-1-True]            247.8783 (4.40)        262.6342 (3.64)        252.2939 (4.15)         6.0853 (2.41)        250.3748 (4.27)         7.0423 (3.26)          1;0   3.9636 (0.24)          5           1
test[SELECT col_time FROM memory.default.orders LIMIT 100000   -1-1-False]           292.6225 (5.19)        311.5513 (4.32)        300.5854 (4.95)         7.4893 (2.96)        298.8763 (5.10)        11.1790 (5.18)          2;0   3.3268 (0.20)          5           1
test[SELECT col_ts_tz FROM memory.default.orders LIMIT 100000  -1-1-False]           316.5236 (5.62)        352.3160 (4.88)        334.0659 (5.50)        14.2427 (5.63)        336.3475 (5.74)        22.4041 (10.38)         2;0   2.9934 (0.18)          5           1
test[SELECT col_time_tz FROM memory.default.orders LIMIT 100000-1-1-False]           317.5971 (5.64)        329.8769 (4.57)        324.4003 (5.34)         4.9138 (1.94)        323.1370 (5.51)         7.2260 (3.35)          2;0   3.0826 (0.19)          5           1
test[SELECT col_ts FROM memory.default.orders LIMIT 100000     -1-1-False]           321.4981 (5.71)        360.5623 (5.00)        337.9481 (5.56)        14.1834 (5.61)        336.1507 (5.74)        13.1048 (6.07)          2;0   2.9590 (0.18)          5           1
test[SELECT col_array FROM memory.default.orders LIMIT 100000  -1-1-False]           327.5637 (5.81)        372.9950 (5.17)        343.7271 (5.66)        19.3416 (7.65)        337.0470 (5.75)        30.1307 (13.96)         1;0   2.9093 (0.18)          5           1
test[SELECT col_array FROM memory.default.orders LIMIT 100000  -1-1-True]            348.9893 (6.19)        365.2217 (5.06)        355.3199 (5.85)         7.6572 (3.03)        350.2910 (5.98)        12.9358 (5.99)          1;0   2.8144 (0.17)          5           1
test[SELECT col_ts FROM memory.default.orders LIMIT 100000     -1-1-True]            535.8305 (9.51)        547.5418 (7.59)        542.9078 (8.94)         4.7537 (1.88)        544.8300 (9.30)         7.0358 (3.26)          1;0   1.8419 (0.11)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-1-False]           644.7335 (11.44)       722.5782 (10.02)       679.7184 (11.19)       28.5072 (11.27)       681.1364 (11.62)       32.2130 (14.93)         2;0   1.4712 (0.09)          5           1
test[SELECT col_time FROM memory.default.orders LIMIT 100000   -1-1-True]            650.1813 (11.54)       695.6361 (9.65)        664.4635 (10.94)       18.6122 (7.36)        654.9983 (11.18)       21.5804 (10.00)         1;0   1.5050 (0.09)          5           1
test[SELECT col_ts_tz FROM memory.default.orders LIMIT 100000  -1-1-True]            679.4406 (12.06)       742.6969 (10.30)       702.3940 (11.56)       24.5208 (9.70)        701.1797 (11.97)       27.5419 (12.76)         1;0   1.4237 (0.09)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-1-True]            680.1243 (12.07)       699.9281 (9.70)        686.6925 (11.30)        7.9132 (3.13)        686.0557 (11.71)        8.8569 (4.10)          1;0   1.4563 (0.09)          5           1
test[SELECT col_map FROM memory.default.orders LIMIT 100000    -1-1-False]           771.7914 (13.70)       829.1764 (11.50)       792.2354 (13.04)       23.0122 (9.10)        788.2306 (13.45)       30.8323 (14.29)         1;0   1.2623 (0.08)          5           1
test[SELECT col_map FROM memory.default.orders LIMIT 100000    -1-1-True]            780.3485 (13.85)       810.3838 (11.24)       789.2438 (12.99)       12.0860 (4.78)        784.6089 (13.39)       10.4768 (4.86)          1;1   1.2670 (0.08)          5           1
test[SELECT col_time_tz FROM memory.default.orders LIMIT 100000-1-1-True]            797.3840 (14.15)       816.6131 (11.32)       809.6229 (13.33)        8.3706 (3.31)        812.8364 (13.87)       13.6315 (6.32)          1;0   1.2351 (0.08)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-10-True]         3,041.5533 (53.98)     3,277.6133 (45.45)     3,110.9412 (51.21)      102.6907 (40.61)     3,046.5480 (51.99)      133.2603 (61.76)         1;0   0.3214 (0.02)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -1-10-False]        3,251.6890 (57.71)     3,446.6940 (47.79)     3,338.8025 (54.96)       80.3881 (31.79)     3,335.5414 (56.92)      133.9719 (62.09)         2;0   0.2995 (0.02)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 1000000             -1-1-False]         6,330.9927 (112.36)    6,613.9482 (91.70)     6,436.0519 (105.95)     110.1593 (43.56)     6,404.8050 (109.30)     137.9425 (63.93)         1;0   0.1554 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 1000000             -1-1-True]          6,431.4048 (114.14)    7,040.4470 (97.62)     6,630.7266 (109.16)     247.0878 (97.71)     6,528.5468 (111.42)     303.4302 (140.62)        1;0   0.1508 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -10-1-False]        6,617.4086 (117.44)    7,100.7170 (98.45)     6,857.8892 (112.90)     208.5954 (82.49)     6,892.7798 (117.63)     370.3442 (171.63)        2;0   0.1458 (0.01)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 100000              -10-1-True]         6,904.8733 (122.54)    7,206.9193 (99.93)     7,031.1452 (115.75)     124.7728 (49.34)     7,030.5854 (119.98)     202.0546 (93.64)         2;0   0.1422 (0.01)          5           1
test[SELECT 1                                                  -1000-1-True]      17,822.1402 (316.29)   20,505.1147 (284.31)   18,591.1697 (306.05)   1,110.4000 (439.10)   18,121.3637 (309.26)   1,178.0420 (545.95)        1;0   0.0538 (0.00)          5           1
test[SELECT 1                                                  -1000-1-False]     23,274.5628 (413.05)   23,912.5884 (331.56)   23,605.6614 (388.60)     265.9247 (105.16)   23,582.0674 (402.45)     454.4689 (210.62)        2;0   0.0424 (0.00)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 10000000            -1-1-False]        62,392.2378 (>1000.0)  63,465.6746 (879.97)   62,888.4911 (>1000.0)    441.8606 (174.73)   62,982.5665 (>1000.0)    715.0867 (331.40)        2;0   0.0159 (0.00)          5           1
test[SELECT * FROM tpch.sf100.orders LIMIT 10000000            -1-1-True]         62,740.9363 (>1000.0)  65,014.5818 (901.45)   63,652.4882 (>1000.0)    831.9718 (329.00)   63,515.4361 (>1000.0)    696.4022 (322.74)        2;1   0.0157 (0.00)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

@mdesmet mdesmet marked this pull request as ready for review January 7, 2023 11:44
@@ -684,6 +686,27 @@ def _verify_extra_credential(self, header):
            raise ValueError(f"only ASCII characters are allowed in extra credential '{key}'")


class ResultDownloader():
    def __init__(self):
        self.queue: queue.Queue = queue.Queue()
Member

According to the docs, the default maxsize is zero, which means unbounded. When I was testing similar changes in the Go driver, it turned out that the queue size didn't matter, so even a queue of size 1 is fine. Did you test different queue sizes, and if not, can you check if maxsize greater than 1 makes a notable difference?
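
One way to try the queue-size question, as a sketch only: `queue.Queue(maxsize=0)` is unbounded, while any positive `maxsize` makes `put` block once the queue is full. The helper names `produce` and `consume_with_maxsize` are hypothetical, and the integer "pages" stand in for real result pages.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


def produce(n_pages, out):
    for page_no in range(n_pages):
        out.put(page_no)  # blocks while a bounded queue is full
    out.put(_DONE)


def consume_with_maxsize(n_pages, maxsize):
    q = queue.Queue(maxsize=maxsize)  # maxsize=0 means unbounded
    worker = threading.Thread(target=produce, args=(n_pages, q), daemon=True)
    worker.start()
    received = []
    high_water = 0  # largest queue length observed by the consumer
    while True:
        high_water = max(high_water, q.qsize())
        item = q.get()
        if item is _DONE:
            break
        received.append(item)
    worker.join()
    return received, high_water
```

With `maxsize=1` the producer can be at most one page ahead of the consumer, which bounds memory use. Whether a larger bound helps throughput would have to be measured against a real server.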

@hashhar
Member

hashhar commented Jan 9, 2023

The benchmark results are very noisy.

Setting the noise aside, the results look like a mixed bag: some data types/queries improve noticeably with legacy type mapping, but the same types regress as soon as type mapping is introduced, which seems unintuitive.

There are some regressions we should look at. I think the regressions with type mapping may show that doing type mapping on the main thread introduces some contention/waiting. The regressions without type mapping are the more interesting results.

  • col_bool with type mapping
  • col_date with type mapping
  • col_inf with type mapping
  • col_int no type mapping
  • col_real with type mapping
  • col_time with type mapping
  • col_ts no type mapping
  • col_ts with type mapping
  • col_ts_tz no type mapping
  • col_ts_tz with type mapping
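
To put numbers on the list above, the Mean column of the two benchmark tables can be compared directly. The sketch below copies the Mean values (in ms) for the listed cases from the tables earlier in this thread; the names `MEANS_MS` and `pct_change` are made up for illustration.

```python
# (column, experimental_python_types) -> (mean before, mean with download thread), in ms
MEANS_MS = {
    ("col_bool", True): (100.9127, 103.2522),
    ("col_date", True): (217.8451, 225.7077),
    ("col_inf", True): (105.7628, 107.3958),
    ("col_int", False): (110.0202, 110.7264),
    ("col_real", True): (157.8911, 160.3993),
    ("col_time", True): (598.2309, 664.4635),
    ("col_ts", False): (330.0500, 337.9481),
    ("col_ts", True): (530.4291, 542.9078),
    ("col_ts_tz", False): (328.3063, 334.0659),
    ("col_ts_tz", True): (699.4769, 702.3940),
}


def pct_change(before, after):
    # Positive values mean the download-thread run was slower.
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for (column, type_mapping), (before, after) in MEANS_MS.items():
        label = "with" if type_mapping else "no"
        print(f"{column} ({label} type mapping): {pct_change(before, after):+.2f}%")
```

Most of these differences are only a few percent, which is in the same range as the StdDev values reported for the same rows, so they should be read with the noise caveat in mind.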

As a side note, there's no point in using the TPCH connector for benchmarking, since it is often bottlenecked by data generation. The memory connector is the right choice.

@hashhar
Member

hashhar commented Jan 9, 2023

Oh, one more thing - are you testing with the server running locally or somewhere remote? Because the local network is so fast that you probably won't be able to measure any benefits of faster downloads.
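
A back-of-the-envelope model of why this matters, under simplifying assumptions that are not taken from the PR: every page takes a fixed time to download and a fixed time to parse, the two overlap perfectly, and there is no GIL or queue overhead. The function names and the numbers in the usage note are illustrative only.

```python
def sequential_ms(pages, download_ms, parse_ms):
    # One thread: download a page, parse it, then move on to the next page.
    return pages * (download_ms + parse_ms)


def pipelined_ms(pages, download_ms, parse_ms):
    # Two threads: the first download and the last parse cannot overlap with
    # anything; in between, the slower of the two stages sets the pace.
    if pages == 0:
        return 0
    return download_ms + (pages - 1) * max(download_ms, parse_ms) + parse_ms
```

With 10 pages, a 1 ms download and a 20 ms parse (roughly a local server), the model gives 210 ms sequential versus 201 ms pipelined. With a 20 ms download (a remote server) it gives 400 ms versus 220 ms, so the benefit only becomes visible once download time is comparable to parse time.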

@mdesmet mdesmet self-assigned this Jan 19, 2023
@bendemott
Contributor

Python's Queue is deeply flawed, and Queue raising Empty is unreliable. Source: I've done tons of multi-threaded programming in Python.

Also, the semaphore that the Queue class uses internally is slow. Queue is designed to protect programmers from themselves when unknown objects are being synchronized.

Because of the Global Interpreter Lock, most methods of a list are thread-safe.
pop, append, and len are thread-safe. So, at least in CPython, you will get a 10x to 100x thread contention speedup by simply using a list, which would appear to work just fine for your use case.
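
A sketch of what the suggestion above could look like, with one substitution stated plainly: it uses `collections.deque` instead of a plain list, because `list.pop(0)` is O(n) and `deque.append`/`deque.popleft` are documented as thread-safe. The names `produce`, `consume` and `_DONE` are hypothetical, and the speedup figure quoted above is the commenter's claim, not something this sketch measures.

```python
import threading
import time
from collections import deque

_DONE = object()  # hypothetical end-of-stream marker


def produce(n_items, buf):
    for i in range(n_items):
        buf.append(i)  # thread-safe append, no explicit lock
    buf.append(_DONE)


def consume(n_items):
    buf = deque()
    worker = threading.Thread(target=produce, args=(n_items, buf), daemon=True)
    worker.start()
    out = []
    while True:
        try:
            item = buf.popleft()  # thread-safe pop from the left end
        except IndexError:
            time.sleep(0.0005)  # buffer empty: back off briefly, then retry
            continue
        if item is _DONE:
            break
        out.append(item)
    worker.join()
    return out
```

The trade-off is that a deque has no blocking `get` and no size bound, so the consumer has to poll and the producer can run arbitrarily far ahead, whereas `queue.Queue` provides both at the cost of extra locking.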

4 participants