Still get 403 with tvc6 API #56
I'm having this problem too; it stopped working all of a sudden. |
Yeah, started today. It's a Cloudflare challenge. <!DOCTYPE html>
<html lang="en-US">
<head>
<title>Just a moment...</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge" />
<meta name="robots" content="noindex,nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<link href="/cdn-cgi/styles/challenges.css" rel="stylesheet" />
</head>
<body class="no-js">
<div class="main-wrapper" role="main">
<div class="main-content">
<h1 class="zone-name-title h1">
<img class="heading-favicon" src="/favicon.ico"
onerror="this.onerror=null;this.parentNode.removeChild(this)" />
tvc4.investing.com
</h1>
<h2 class="h2" id="challenge-running">
Checking if the site connection is secure
</h2>
<noscript>
<div id="challenge-error-title">
<div class="h2">
<span class="icon-wrapper">
<div class="heading-icon warning-icon"></div>
</span>
<span id="challenge-error-text">
Enable JavaScript and cookies to continue
</span>
</div>
</div>
</noscript>
<div id="trk_jschal_js" style="display:none;background-image:url('/cdn-cgi/images/trace/managed/nojs/transparent.gif?ray=75c30ff7bb5a6949')"></div>
<div id="challenge-body-text" class="core-msg spacer">
tvc4.investing.com needs to review the security of your connection before proceeding.
</div>
<form id="challenge-form" action="THE_ORIGINAL_REQUEST&__cf_chl_f_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U" method="POST" enctype="application/x-www-form-urlencoded">
<input type="hidden" name="md" value="i2FRFTE2du3Qco3QFnvnVH6.clW3hLi7rIRscIks2TM-1666114860-0-AcxJCTgnfWs_Qs4t78bTJ7zR19Nw4lWfdvIJeud2RRvvpSdF4TH2PehqmKoIphrQQlXmFrE0VL9Z5Z7suFKpcqfGvv6ctZmz1GJduxQU0aS66SjKgp7hovDYOG_ztWqJFOfLZYtpWY99YJOfYIGMhaJJy-_BrPvA3XD5gDjIA_h2wLQmleMM4gGPM8sP4TfzcrFYd8M8bBA_nDo76-ERHWgWNjmZuvJ1L7KZfzwmGsIodKaIXGQJKk6C1-pm1ZI_zIshwem03pHufykl8ARM43Y072hn1lL0XoxoiYdSnQMrExkDoOnX6Wp-WkQKGIcCaDclEFAkk7kZMSy8L8UCgb_IjW9Th8BIVrtWUElYtH61b8tanqpUUHcTB4E0FjNpMPcxpzVuqfhGgQcBMo3cPBDe2cZn5c5bOrmxIiOQ41ffU0ifXkXi0lrp-VUqRW_Qgk__QbbOWa5I0KBQhyMhZabVU5770I1LYLtVQtETweVJSXGnA1MrC2hUskCBhOj1gwzgcIpynlJqPf_aSArRfGGLahw-abD-Cy2QnAl0xtN0-YkTIouYTjEu3Z9WYelGrORWAdfQdDQHFWO0ZYlBsIn_4CUTu6ppqeWBHHPvrVRBuR3I08JYma28PH9v-z-tDsVm6Z-3TyTnhIg2VVe6Q8sbwHblVZz9Bn1suzePR4e52llJ928hugM5EXaZMoUI00qogHbkhHe_vk0ZgB3gSExR7DmmIGQAYNUQxv7K4pROCYT4Uo1Tp1vjC-w3W4DgSoUczpbQMTpMiQQo2viy6OdyKKBlo4td68IhXgSekLGUcxk6caJz_Kq6VXzDjOHGa2nvVXCk7T6ljBlUp5fa8sjtNKCrlhZpOdt3PAdBwKsOCT7_4aWyXTVSMVLkde7AIg0aDSsoO7m49QBuWSZxjJRZviMFsmC6XARLWlWUdzy53z58MwY_23bGG1Da8K-5Hg" />
<input type="hidden" name="r" value="bsg4kbcXCUmlUjRsYoAkZTuub56rjGcWQE8YX1hY.O4-1666114860-0-AbCLxJDTqdDuai3VS4783VfnL0UHFlW3M4RW7bFmUT0TWldMizfI3GdgPles+wdN7hCBnO+DsjDblLLgGoSAsrWhy7LDMOLPBU9cFwImwuX2UcgPv6c7+ueGoLc7PHRgOeJjue4jJWBHgpjqSlIShfJLgMMCD1hZbVpMyVY0sxIr019+/HRbfL3lnJxC8W2+4a0kr6rTqPGn952I4Vyw6hBbVgposhygLwduu0khqMUtuQOHYC4QFD1j8h9lU9sHl4fZMa5E8ufZiiRvqiO2muR7lOi2TnFUx83VS06iApzfQTkTWBOpUAJbA3FddVVZpiwUVw+yxFQgpDO9Md9HHokktXMOMXMV656AwnXAsRD/YHDtkhkLYAap+ESP17fTSOqKvJ7YYfWW3IfBW3tziuGAVDxrZT3Nbjvbft8rN/ho0cWJZEd1dLncsVxF+G4ebuRx1q5WmLx5nJYH33XfSL4+M44ttcDcsO4bnJNADV9ODxITkPbPdzx8IBVGxEOKo4Yp+DrmO/R38IgKqrr3UMtvQbLIhoqojzSa7Jw/ZmPBYSzXwk3bQWLiqubDmSb5xXdOwOf7VqZg+L9Gh49+R0Br/2ZCqb8bg3BPygnbFGOaBhoM9RlYGat4CzgVm3ZLC9gOtkyRLkhz3o2niFOLvzmcvlJJlwXwWJMYAQshsvLxni85x/+XQRK/cGUF8B7btxdUaGdT0pWTCgZyhvayY2tXy4ddsLIW83p2sAdLSko7u4nq8g+3fQDUJVzsmV9sSQDF2Rl16aw3cy/UwWZyGkp7Zti+AX8gZabOBQUqwKvJRHWvOwNhZlTq521mNOlMOeBbXh5qaSp57vnYTUFP85kKZ6j03S7gHdbbGG2m3sFpbrOYuMciUX4NSSRAwlLP21+I9W5U7wFk/Rkr8WNc6o2j6LeczeSNS2PC8rcDAN0L4uhHzJHek41QvyFsZX5Bm0SLDQY0fZ0kNeh3v+FqWmdXa/gEzf91nP2kdiU23Q0xqlDeVgVY/I+M03zCzbGTp6p5X4iCf1kwGU2dBTGJk2vGJztiLUUCQFvEITnj58xuIAdflIbDIGLDXS7Ft1nVKy81pxI9IrEhbvc+QVNXW0Lah5SKKXqJwJg7PRyd95aSsriVQ/agISwr8BCNAbvrMA01eUO8jTCgkubwoVqeiDdZK2csFguP+KHn7mt3sZgvwocBsLITAzFGY+c22XwCzWhKtgKWI8/H717Nm/DWMqXuU0CAZIW7jEuvQ7p+RwU+73xRCt3CCRrDxplyIEVHbmE6WAW3DMA+fqCkK3h4P3+3Ol6wW8xzccCSTQhspm1yv26v57FDJe2a/itABooMRKxxz6fzwLmKAVhQH7+gkt+fnX9wQr3EHuQ0dFM6A4C/lgXHN+le6oUCx3uUPQkOA+2YiuaTNn/DhwyWYRDzBtbLBbUYCDKBz/2VMvBW/ph2Gw9l3yn7Mx5uYDHg3utP4+nkTRLHBTzrKV5tp0tCdXIC0g0Zs9UdASPlen5ACVxGwDmqGsR7k98aM7eSFGXe9Ls/UNPki5Y5jWH3RjlNkhZP3aHViY6ArdZg64wiOQ7GMF1jPBX2ZYZO7HxFmSxjS2Zg35XW0gG/ZCs+6nINwNW0lX9/3q7mMMM9BoxVOD/6wrjWu79htO78h/VU36ne04fqOnHqf2nD2joOGHojvAtyjn7+fyk87XwD6bVFQDLr6abpWx4ge+t9HRGCr2ZpmVhWWQvJU8vkOqsFUgwfqnd+tcf+S4MrHdnm4fEgqyUIXny8BptgVWJFxQ8rWt8EKflPUO8JiaFvqpQxd7kR+DU7IrsIxTvFoeth1TWiVGDW"/>
</form>
</div>
</div>
<script>
(function(){
window._cf_chl_opt={
cvId: '2',
cType: 'managed',
cNounce: '82843',
cRay: '75c30ff7bb5a6949',
cHash: 'be597fde60719a5',
cUPMDTk: "THE_ORIGINAL_REQUEST&__cf_chl_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U",
cFPWv: 'b',
cTTimeMs: '1000',
cTplV: 4,
cTplB: 'cf',
cRq: {
ru: 'aHR0cHM6Ly90dmM0LmludmVzdGluZy5jb20vMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAvMC8wLzAvMC9oaXN0b3J5P3N5bWJvbD1OQVNEQVErJTNBQUFDRyZyZXNvbHV0aW9uPUQmZnJvbT0xNjY1NzA1NjAwJnRvPTE2NjU5NjQ4MDA=',
ra: 'TW96aWxsYS81LjAgKFdpbmRvd3MgTlQgMTAuMDsgV2luNjQ7IHg2NCkgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzEwNi4wLjAuMCBTYWZhcmkvNTM3LjM2IEVkZy8xMDYuMC4xMzcwLjM0',
rm: 'R0VU',
d: 'ciZO7j9M6V/+BOoLttyx/6/zvhioUhl3V8HJR5f4qtcwO9dzK+lS8h/HFRESEsMGJB+1mXmCUiXlKkhIOhxzO+1kzC3tOtpQitYnAKzgGJlYsarUBi4CJl33PrEiz0X0sY5GpIitN0thDqXwplEOps8LGNuis5yV/DL/9UF1uCQwg5Om4ZUor1GJx3LDh0WnKh2DnnT27IOcnbUWTsIhLrym2aCB5x8itCCwhgg9syMvQUePt5ZENxfN/ZNXVWdJSGys5j0ArzlamZpHEBpEGGwiizBP5eRLfbAoYp61nDqGF0HTcE5UVewVdV9DVsqJ365daZPv1aGEJE9+KreDXGXW+fdrmA0vR5KXBQhYF26rDcLuc/P28Y759ucTRyb4Q5RPbZlzMZm1CZuErVzdMIXshPWJljQwlSIu2KA13A/B3rfg1HVUXiIY4Lav82+LY0gcX4psQ9zZctpBm52bt1YzIyys5l48JnwINBkGCwcnGiPSZJK5FrrkmZVh3N+CIEo9XYBURG9MOGXdzMNNjBT3I9ThSCi/PwlHQAW+MrMCkn88BidoPCNBWAZw+/ySxyx0HVTOrzssW1lhQi4Jcb5Z+gOgXD8esdMOeDxUigt181VPKJUzt549IGw68kxC',
t: 'MTY2NjExNDg2MC43ODEwMDA=',
m: 'aE7puev+CE8d/wG+uUnXqlSR38vGD/7nUVnRmrSO93M=',
i1: '89LHcnKR77lGOKK2z/HK1w==',
i2: 'pHrhEoXPZxQUcKsdQNMNiA==',
zh: 'JJQg2KI/+bPgJbLHlLjmrs/mnno8aAGH5k3tm8QDk4c=',
uh: 'ndhEe3dibHzXHi71SzbRYjwpKAQEbkeBd4r+hDwx6tA=',
hh: '+t0nuRZbzd0CcVt9ZZq81dRfWmk/KHIWqVnFfjx7L+s=',
}
};
var trkjs = document.createElement('img');
trkjs.setAttribute('src', '/cdn-cgi/images/trace/managed/js/transparent.gif?ray=75c30ff7bb5a6949');
trkjs.setAttribute('style', 'display: none');
document.body.appendChild(trkjs);
var cpo = document.createElement('script');
cpo.src = '/cdn-cgi/challenge-platform/h/b/orchestrate/managed/v1?ray=75c30ff7bb5a6949';
window._cf_chl_opt.cOgUHash = location.hash === '' && location.href.indexOf('#') !== -1 ? '#' : location.hash;
window._cf_chl_opt.cOgUQuery = location.search === '' && location.href.slice(0, -window._cf_chl_opt.cOgUHash.length).indexOf('?') !== -1 ? '?' : location.search;
if (window.history && window.history.replaceState) {
var ogU = location.pathname + window._cf_chl_opt.cOgUQuery + window._cf_chl_opt.cOgUHash;
history.replaceState(null, null, "THE_ORIGINAL_REQUEST&__cf_chl_rt_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U" + window._cf_chl_opt.cOgUHash);
cpo.onload = function() {
history.replaceState(null, null, ogU);
};
}
document.getElementsByTagName('head')[0].appendChild(cpo);
}());
</script>
<div class="footer" role="contentinfo">
<div class="footer-inner">
<div class="clearfix diagnostic-wrapper">
<div class="ray-id">Ray ID: <code>75c30ff7bb5a6949</code></div>
</div>
<div class="text-center">Performance & security by <a rel="noopener noreferrer" href="https://www.cloudflare.com?utm_source=challenge&utm_campaign=m" target="_blank">Cloudflare</a></div>
</div>
</div>
</body>
</html> Note: I replaced the original request with Cloudflare challenges are JavaScript challenges. If this continues, Edit: Scratch that. Maybe it actually is a header issue that Cloudflare doesn't like. |
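For anyone reading the dump above: several of the `cRq` fields (`ru`, `ra`, `rm`, `t`) appear to be plain base64 of the original request's URL, user agent, method and a timestamp. A minimal sketch for decoding them from a saved challenge page follows; the function name `decode_challenge_fields` is made up for illustration, and the assumption that these fields are base64 text is only based on this one response.

```python
import base64
import re


def decode_challenge_fields(html, keys=("ru", "ra", "rm", "t")):
    """Return the base64-decoded values of selected cRq fields found in html.

    Hypothetical helper: the field names come from the challenge page pasted
    in this thread, and Cloudflare may change them at any time.
    """
    decoded = {}
    for key in keys:
        # Matches lines such as:  rm: 'R0VU',
        match = re.search(rf"\b{key}:\s*'([^']*)'", html)
        if match:
            raw = base64.b64decode(match.group(1))
            decoded[key] = raw.decode("utf-8", errors="replace")
    return decoded


# Tiny excerpt in the same shape as the script block in the pasted page.
sample = "cRq: {\nrm: 'R0VU',\n}"
fields = decode_challenge_fields(sample)
print(fields)
```

Running it on the full page should show which URL and user agent Cloudflare saw, which helps when checking whether a header change actually reached the server.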
Maybe this?: https://stackoverflow.com/a/72728221 |
Soooo the day came 😞 Investing.com protected all their APIs with Cloudflare... I've tried to contact them (resent them the email I sent them a couple of weeks ago) but no news from their side so still waiting! FYI I've tried all the available APIs |
Is there any way at all that we can help? Maybe we can try sending them emails too or something like that? |
Maybe cloudscraper can help. Also someone succeeded Here is a minimal tutorial |
I've been just contacted by them this morning, they've redirected me to the proper team, but told me in advance that removing Cloudflare from their side is not an option as that has been done for security reasons, so I'll keep you all posted! 🤗 |
I'll try those later today, thanks! Anyway I think that just works with Cloudflare v1, and Investing.com is using Cloudflare v2, so it probably won't work... but I'll test it anyway 😄 |
Soooo I've re-run the tests and it seems to be working? @anarchy89 @hamzaahmedzia1 @joao-pm-santos96 @Guvalle @PogoRollo can anyone confirm? Thanks 🤗 |
no changes needed? |
It can be tuned down without completely disabling it (lower Security Level in Cloudflare will allow higher Threat Scores to have no challenges).
Doesn't seem to be working for me. Still getting Cloudflare challenges. |
No @GSLabIt, just use |
I've checked again now, and it's not working again... Sorry for the confusion, it worked for me for around 50 requests more or less! |
I'll test again tomorrow, but it seems that Investing.com is blacklisting some IPs temporarily more often than before, which means that you can send some requests but then you're blocked (Cloudflare challenges your IP address...) |
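Since some requests go through and later ones get challenged, it may help to tell a Cloudflare challenge apart from an ordinary 403. A rough sketch, assuming the challenge page keeps the markers seen in the HTML pasted earlier in this thread (the function name is hypothetical and the markers are not guaranteed to be stable):

```python
def looks_like_cloudflare_challenge(status_code, body):
    """Heuristic check based on the challenge page pasted in this thread.

    Assumption: a challenge is a 403 whose body contains at least one of the
    markers below; other Cloudflare responses may look different.
    """
    markers = (
        "<title>Just a moment...</title>",
        "_cf_chl_opt",
        "/cdn-cgi/challenge-platform/",
    )
    return status_code == 403 and any(marker in body for marker in markers)


challenge_body = "<html><head><title>Just a moment...</title></head></html>"
print(looks_like_cloudflare_challenge(403, challenge_body))
print(looks_like_cloudflare_challenge(403, "Forbidden"))
```

A caller could use this to stop sending requests and wait once a challenge shows up, instead of retrying immediately.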
Same error for me: |
Then cloudscraper is probably the way forward. I think they support v2 for cloudflare. Also there are other options available; not verified though |
Still doesn't work for me. I think we should figure out how to bypass the Cloudflare check, because they will just keep turning on Cloudflare's web scraping protection. I saw there was a library and tried it, https://pythonlang.dev/repo/venomous-cloudscraper/, but it doesn't seem to work. |
Hi @hajimebusuzima96 I tested |
Instead of looking for ways to bypass their security measures, I'd wait for them to come back to me with an answer on the collaboration proposal, as that way we wouldn't be skipping their security and potentially breaking Investing.com's terms of use. But as I said before, when this issue first happened I tested those libraries for the sake of exploring the issue further, as I didn't know it was Cloudflare at the beginning, and had no luck with any of them. I didn't try |
I mean that would be great, I don't even mind using their api or a paid api from their side but they don't have that, they offer a simple pro service with no data download possible. |
I know @anarchy89, but I prefer to wait until they answer me, so as to see whether we can make either |
Investing.com will most likely never loosen their policy on bots unless they're paid to do so. Bots use the resources of investing.com without viewing their ads, meaning investing.com effectively loses money to bots. A Selenium or other browser-based solution might work by generating a different fingerprint for Cloudflare, which may be whitelisted, but that would mean heavier local resource usage from the bot. Obviously, a lighter HTTP client such as requests or httpx is preferred, but "you gotta do what you gotta do" |
So I've just tested Look: (investiny-py3.9) alvarobartt@Alvaros-MacBook-Air investiny % poetry run python
Python 3.9.6 (default, Aug 5 2022, 15:21:02)
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
{'date': ['09/26/2022', '09/27/2022', '09/28/2022', '09/29/2022', '09/30/2022', '10/03/2022', '10/04/2022', '10/05/2022', '10/06/2022', '10/07/2022', '10/10/2022', '10/11/2022', '10/12/2022', '10/13/2022', '10/14/2022', '10/17/2022', '10/18/2022', '10/19/2022', '10/20/2022', '10/21/2022'], 'open': [149.66000366211, 152.74000549316, 147.63999938965, 146.10000610352, 141.2799987793, 138.21000671387, 145.0299987793, 144.07499694824, 145.80999755859, 142.53999328613, 140.41999816895, 139.89999389648, 139.13000488281, 134.99000549316, 144.30999755859, 141.06500244141, 145.49000549316, 141.69000244141, 143.02000427246, 142.96000671387], 'high': [153.7700958252, 154.7200012207, 150.64140319824, 146.7200012207, 143.10000610352, 143.07000732422, 146.2200012207, 147.38000488281, 147.53999328613, 143.10000610352, 141.88999938965, 141.35000610352, 140.36000061035, 143.58999633789, 144.52000427246, 142.89999389648, 146.69999694824, 144.94920349121, 145.88999938965, 147.83999633789], 'low': [149.63999938965, 149.94500732422, 144.83999633789, 140.67999267578, 138, 137.68499755859, 144.25999450684, 143.00999450684, 145.2200012207, 139.44500732422, 138.57290649414, 138.2200012207, 138.16000366211, 134.36999511719, 138.19000244141, 140.27000427246, 140.61000061035, 141.5, 142.64999389648, 142.67999267578], 'close': [150.77000427246, 151.75999450684, 149.83999633789, 142.47999572754, 138.19999694824, 142.44999694824, 146.10000610352, 146.39999389648, 145.42999267578, 140.08999633789, 140.41999816895, 138.97999572754, 138.33999633789, 142.99000549316, 138.38000488281, 142.41000366211, 143.75, 143.86000061035, 143.38999938965, 147.27000427246], 'volume': [93339000, 84443000, 146691008, 128138000, 124925000, 114312000, 87134000, 79148000, 68402000, 85926000, 74591000, 77034000, 69833000, 112876000, 88237000, 84684000, 98716000, 61758000, 64277000, 85641896]} |
I'm still getting '403' errors right from the start. |
The '403' error appears to me as well right from the start. |
Ok @marcnshapiro @InnovArul let me re-trigger the CI/CD pipeline to check whether it's working or not, as for me it's working fine locally! |
@alvarobartt They do allow some level of "free" use of their site - think of ppl behind a firewall that's blocking ads and stuff - but they don't want massive scraping, which costs resources (and could come from a competitor). |
Hello @alvarobartt and thank you for working on both, |
@alvarobartt I was getting 403 too, but changing the user-agent header solved it. |
Hi @thrasher456, I'm happy to know more about this issue, would you mind sharing a code-snippet in Python so that I can try to replicate? Thanks in advance! |
@alvarobartt so the main change i did was to change user-agent to cloudflare in utils.py and send a cloudflare cookie as well like below
One idea is to use httpx.cookie to save the "__cf_bm" cookie and refresh it with every request; this will make sure we always pass the Cloudflare check. But the initial __cf_bm cookie must be supplied by the user at least once; we can then store the refreshed __cf_bm cookie from each response. Attaching utils.py & tmp.py with txt extension |
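To make the cookie idea above concrete, here is a stdlib-only sketch of the "store the refreshed `__cf_bm` from each response" step. It is not the attached utils.py; `refresh_cookie` and the sample values are made up for illustration, and whether this satisfies Cloudflare at all is an open assumption. An `httpx.Client` already persists response cookies across requests, so in practice this step is mostly handled by the client.

```python
from http.cookies import SimpleCookie


def refresh_cookie(jar, set_cookie_header, name="__cf_bm"):
    """Return a copy of jar with `name` replaced by the value in Set-Cookie.

    Hypothetical helper: jar is a plain dict of cookie name to value, and
    set_cookie_header is the raw Set-Cookie header of the last response.
    """
    parsed = SimpleCookie()
    parsed.load(set_cookie_header)
    updated = dict(jar)
    if name in parsed:
        updated[name] = parsed[name].value
    return updated


# The first value has to be supplied by the user once (e.g. from a browser).
jar = {"__cf_bm": "initial-value-from-browser"}
jar = refresh_cookie(jar, "__cf_bm=new-value; path=/; HttpOnly; Secure")
print(jar)
```

The updated dict would then be sent as the cookies of the next request.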
I guess we'll need to pass "startup" cookies and headers to investpy, then internally investpy will have to update them for every request? |
I guess Cloudflare is checking the TLS fingerprint. A 403 will come up if the TLS fingerprint is not allowed in Cloudflare Bot Manager. None of the major web browsers uses OpenSSL, which is what Python uses. If that is the case, do you think we could get past the Cloudflare bot check by developing a tool that uses BoringSSL (the same as Chrome)? |
Hello, first, thank you for the amazing project. When I tried the examples from the documentation I still got a 403 error :/, even with the new API.
I tried:
and received:
I am using Anaconda at this point, but driven by curiosity I tried opening the API link in the browser and still got the 403 error; after a refresh, though, all the data loaded correctly.
EDIT: in the browser, tvc4 had the same behavior. Maybe it's some header or cookie problem?