september 13, 2020 // 9 min read
disclaimer: this post is a translation of a post i wrote almost 5 years ago in russian on teletype.
a lot has probably changed since then (and im not even sure if it was worth translating), and these days i would probably employ very different techniques, but i just really wanted to preserve the post and remove my teletype account, so here we are
hewwo
about a month ago i was poking at cloudflare uam (hereafter just uam), but i couldn't figure out how to write a post about it.
uam is that annoyance you see for a while before you can visit some sites, and it checks something. i wanted to make a bypass for it, without using a headless browser
spoiler: i wasn't able to~
the overall structure of uam is explained here fairly well, tldr:
- /cdn-cgi/challenge-platform/orchestrate/captcha/v1
- /cdn-cgi/challenge-platform/generate/ov1/..., resulting in another JS script
- cf_chl_rc_ni)
- POST request to the original URL with ?__cf_chl_jschl_tk__ query param set, or the page refreshes (and requests a new challenge)
sounds uncomplicated?.. we just need to parse the code and send it what we need?..
detecting the page containing UAM is fairly simple: the response always has a 503 status code, it always has a Server: cloudflare header, and the response is an html page containing an element form.challenge-form.
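the three signs above can be folded into one rough check. this is only a sketch: `looksLikeUam` and the shape of `res` (a plain object with `status`, lowercase-keyed `headers` and a `body` string) are made up for illustration, and the form is matched with a crude regex instead of a real html parser.

```javascript
// rough sketch, not a complete detector.
// `res` is a hypothetical plain object: { status, headers, body },
// where header names are assumed to be lowercased already.
function looksLikeUam(res) {
  const server = String(res.headers['server'] ?? '').toLowerCase();
  if (res.status !== 503 || server !== 'cloudflare') return false;
  // crude stand-in for the css selector form.challenge-form
  return /<form\b[^>]*\bclass=["'][^"']*\bchallenge-form\b[^"']*["']/i.test(res.body);
}
```

a real client would probably want an actual html parser here, since attribute order and quoting can vary.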
if we open the code of the UAM page, we immediately see this:
these params change on each page refresh - obviously, these are some params of the challenge, generated by the server. it's fairly trivial to parse them since this is just json5.
right after that a challenge script is loaded:
and.. (who would have thought) it consists of obfuscated garbage:
it's immediately obvious that this script was obfuscated using a modified obfuscator.io, except that the strings array is encoded as '...'.split(',') instead of ["a",...].
this obfuscator gives itself away by: decoding strings with b('0xXX') (purple), shifting the array of strings (green) and using random 5-character strings (red).
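to make the scheme a bit more concrete, here's a toy version of it. the strings, the shift count and the `deobfuscate` helper are all made up for the demo; real challenge scripts use a much larger array and the accessor name may differ.

```javascript
// toy string array in the '...'.split(',') form (made-up contents)
const strings = 'log,charAt,hello,split'.split(',');

// the array gets shifted a fixed number of times before it is used
function rotate(arr, times) {
  for (let i = 0; i < times; i++) arr.push(arr.shift());
  return arr;
}

rotate(strings, 2); // 2 is an arbitrary shift for this demo

// accessor in the style of b('0xXX'): a hex index into the shifted array
const b = (hex) => strings[parseInt(hex, 16)];

// undo the indirection: replace every b('0x..') call with the string literal
const deobfuscate = (src) =>
  src.replace(/\bb\('(0x[0-9a-f]+)'\)/gi, (_, hex) => JSON.stringify(b(hex)));
```

with this setup, `deobfuscate("console[b('0x2')](b('0x0'))")` turns the call back into `console["log"]("hello")`.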
if we deobfuscate this goodiness and rename a few things, it becomes pretty clear what's going on:
- cf_chl_prog=e is set
- /generate/ov/... request is sent with the challenge params and some weird string
and here's the first problem: the string that we need to add is hidden inside the obfuscated code.
it's fairly trivial to find it with a regex, though, if we make a bet about how this string is generated (i think it's something like ${Math.random()}:${floor(Date.now() / 1000)}:${sha256(previousString + SECRET_SALT)}).
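a regex built on that bet could look roughly like this. everything here is a guess on top of a guess: the `0.xxx` random part, the 10-digit unix timestamp and especially the 64-char hex encoding of the hash are assumptions, and `findChallengeString` is a made-up name.

```javascript
// sketch only: matches a quoted string shaped like
//   <Math.random()>:<unix seconds>:<sha256 as hex>
// the hex encoding of the hash is an assumption, not something confirmed.
function findChallengeString(src) {
  const m = /(['"])(0\.\d+:\d{10}:[0-9a-f]{64})\1/i.exec(src);
  return m ? m[2] : null;
}
```

if the hash turns out to be encoded differently, only the last part of the pattern needs to change.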
who the fuck is window.sendRequest() though? it seems to be a function defined in the same obfuscated script, which makes a request to the given url and eval()-s the result:
the one thing that immediately stands out is compressToEncodedURIComponent.
if we google the name, https://github.com/pieroxy/lz-string/ will be one of the first results. to verify, we can compare the implementation:
looks similar!? but what is that string instead of keyStrUriSafe? it doesn't match the string in the lz-string source code:
and here's the fun part. this string is not a constant, and changes between challenges! so, to properly decode the body, we somehow need to extract this string from the obfuscated source code...
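assuming the per-challenge string is already known and is used exactly like keyStrUriSafe (one character per 6-bit value), there's no need to patch lz-string itself: the text can be remapped character by character between the custom alphabet and the stock one. `remap` is a made-up helper, and `STOCK` is the alphabet from the lz-string source.

```javascript
// the stock lz-string alphabet used by compressToEncodedURIComponent
const STOCK = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-$';

// translate `text` from one alphabet to another, index by index.
// assumes both alphabets have the same length and no repeated characters.
function remap(text, from, to) {
  if (from.length !== to.length) throw new Error('alphabets must have the same length');
  let out = '';
  for (const ch of text) {
    const i = from.indexOf(ch);
    if (i === -1) throw new Error(`character ${JSON.stringify(ch)} is not in the source alphabet`);
    out += to[i];
  }
  return out;
}
```

so decoding a captured body would be something like `LZString.decompressFromEncodedURIComponent(remap(body, customAlphabet, STOCK))`, and encoding goes the other way: compress with the stock function, then `remap(result, STOCK, customAlphabet)`.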
first of all, the signature above - '...'['charAt'](...) - is, unsurprisingly, not unique. and then – this signature isn't even always correct!
remember, this script is obfuscated with obfuscator.io? its "Control Flow Flattening" feature might randomly enter this function and extract this string into the JS object, for example like this:
and in some cases, "Dead Code Injection" might transform this function into something like this:
to solve this unfortunate problem, i used a very powerful technique called regexes (fuck ast, all my homies parse code with regex).
explaining the regexes themselves would be super boring and tedious (and they're also incredibly weird), so i'll just give the gist:
- let ?N (N is a number) be the number of the match group
- let ?N<T> be the number of the match group with type T
- let ^N be the regex reference to the N-th match group from the previous regex
- let ^^N be the regex reference to the N-th match group from the pre-previous regex
- ?1<JSIdent>["compressToEncodedURIComponent"]=?2<JSIdent>['?3<JSIdent>']
- '^3<JSIdent>': function(?1<JSIdent>), and parse its body
- ^^2['?1<JSIdent>'](^1,6,function(?2<JSIdent>){return ?3<JSString>["charAt"](\2)})
- ^^2['?1<JSIdent>'](^1,6,function(?2<JSIdent>){return ?3['?4']["charAt"](\2)})
- ?1["^3"]=?2, where ?2 is either a JS string or a regex reference, and ?1 is the key
ok so now we have the alphabet used to encode the request to the server. we can now try running the script and see what the server returns.
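for the curious, the first three steps of that chain can be sketched with plain js named groups, feeding captures from one regex into the next. this only covers the simple case where the alphabet is an inline string literal; the variants produced by control flow flattening and dead code injection need extra patterns. `extractAlphabet`, the group names and the `sample` input are all made up for illustration, and the sample is synthetic, not real challenge code.

```javascript
const IDENT = '[A-Za-z_$][\\w$]*';
// escape a captured identifier before putting it into the next regex
const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

function extractAlphabet(src) {
  // step 1: X["compressToEncodedURIComponent"]=OBJ['KEY']
  const step1 = new RegExp(
    `(?<target>${IDENT})\\["compressToEncodedURIComponent"\\]=(?<obj>${IDENT})\\['(?<key>${IDENT})'\\]`
  ).exec(src);
  if (!step1) return null;
  const { obj, key } = step1.groups;

  // step 2: 'KEY': function(ARG), then look only at what follows it
  const step2 = new RegExp(`'${escapeRe(key)}':\\s*function\\((?<arg>${IDENT})\\)`).exec(src);
  if (!step2) return null;
  const body = src.slice(step2.index + step2[0].length);

  // step 3: OBJ['...'](ARG,6,function(CB){return 'ALPHABET'["charAt"](CB)})
  const step3 = new RegExp(
    `${escapeRe(obj)}\\['${IDENT}'\\]\\(${escapeRe(step2.groups.arg)},6,` +
    `function\\((?<cb>${IDENT})\\)\\{return (?<quote>['"])(?<alphabet>[^'"]+)\\k<quote>` +
    `\\["charAt"\\]\\(\\k<cb>\\)\\}\\)`
  ).exec(body);
  return step3 ? step3.groups.alphabet : null;
}

// synthetic input shaped like the patterns above (NOT real challenge code)
const sample = `var h={'kLmNo':function(x){return h['pQrSt'](x,6,function(y){return 'zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA9876543210-+$'["charAt"](y)})}};e["compressToEncodedURIComponent"]=h['kLmNo'];`;
```

calling `extractAlphabet(sample)` returns the 65-character string from the sample, and `null` when any step of the chain fails to match.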
the hardest part is over, right???
the server returns random garbage too (so unexpected, much wow):
which is actually fairly simple to decode by looking at how this is done in the UAM code (in essence - js UTF-16 weirdness). after decoding the random garbage becomes... random garbage:
this time some custom (or maybe commercial, i haven't seen it before) obfuscator is used, and it uses the Ray-ID of the challenge as a key to decode strings.
if we deobfuscate it, some fairly interesting code pops out, which is too big to screenshot, so here's a pastebin.
the script contains a sequence of several mini-tests of various browser apis. some of them aren't even executed (additional obfuscation or a bug?), some are additionally obfuscated with obfuscator.io, and the results of each test are appended to the "log" in the context of the challenge.
an incomplete list of browser features that are checked:
- +![] etc.)
- eval()
- windows.SHA256, which was defined in the original obfuscated script (fun fact: the sha256 impl is real. i would change the algorithm a bit for fun :3)
- navigator.* apis
- global and process (global objects in Node.js)
after this script is done, the entire context (along with the "log") is sent to the server, analyzed, and the server decides what to do with me (redirect to the site, give a captcha or send me another challenge).
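to give a feel for it, mini-tests in the spirit of the list above could look roughly like this. this is a made-up illustration, not the deobfuscated code: `runMiniTests`, the test names and the `name=value` log format are all assumptions.

```javascript
// made-up mini-tests; the real ones are obfuscated and far more numerous
function runMiniTests() {
  const log = [];
  const record = (name, value) => log.push(`${name}=${value}`);

  // js type juggling: ![] is false, and +false is 0
  record('coercion', +![]);
  // eval() has to exist and actually evaluate code
  record('eval', eval('1+1') === 2);
  // browsers expose navigator
  record('navigator', typeof navigator !== 'undefined');
  // Node.js exposes global and process, which a real browser does not
  record('node', typeof global !== 'undefined' && typeof process !== 'undefined');

  return log;
}
```

running this under Node.js logs `node=true`, which is exactly the kind of result that gives a non-browser environment away.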
i didn't even bother with the captcha: it can just be sent to a captcha solving saas, so that part is fairly simple.
the inner workings of UAM reminded me a lot of a software-only SafetyNet, but on a much smaller scale (cloudflare doesn't even bother with wasm, just obfuscators).
i tried to use JSDOM to emulate all of the apis above, but failed miserably, so kinda gave up and wrote a tiny script with puppeteer
100 sloc script written in 30 minutes with puppeteer works, 700+ sloc script i spent a few days on with a lot of 228iq stuff - doesn't
come up with your own conclusions :3
and at last, a few memes from inside the final obfuscated script:
there won't be any sources. neither the puppeteer version, nor the original version. first of all, the code is shit, secondly, i just dont want to~
note as of translating: im not sure if i even have the sources anymore lol
i described a bunch of stuff here, so good luck and please reach out if you end up with something interesting :3