inside cloudflare uam

september 13, 2020 // 9 min read

disclaimer: this post is a translation of a post i wrote almost 5 years ago in russian on teletype.

a lot has probably changed since then (and im not even sure if it was worth translating), and these days i would probably employ very different techniques, but i just really wanted to preserve the post and remove my teletype account, so here we are

hewwo

about a month ago i was poking at cloudflare uam (hereafter just uam), but i couldn't figure out how to write a post about it.

uam is that annoyance you see for a while before you can visit some sites, and it checks something. i wanted to make a bypass for it, without using a headless browser

spoiler: i wasn't able to~

this pokemon

outside

the overall structure of uam is explained here fairly well, tldr:

  • when the UAM page loads, it loads a JS script from /cdn-cgi/challenge-platform/orchestrate/captcha/v1
  • this script generates something and makes a POST request to /cdn-cgi/challenge-platform/generate/ov1/..., resulting in another JS script
  • that script also executes and does something, after which it makes a request to the same URL (but with a different body and a cf_chl_rc_ni)
  • finally, it sends a POST request to the original URL with ?__cf_chl_jschl_tk__ query param set, or the page refreshes (and requests a new challenge)

sounds uncomplicated?.. we just need to parse the code and send it what we need?..

inside

detecting the page containing UAM is fairly simple: the response always has a 503 status code, it always has Server: cloudflare header, and the response is an html page containing an element form.challenge-form.
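a minimal sketch of that check, assuming the response was already fetched and flattened into a plain object with lowercased header names. the looksLikeUam / hasChallengeForm names and the response shape are made up for illustration:

```javascript
// hypothetical helper: does any <form> tag carry the "challenge-form" class?
// regex-based on purpose, a real html parser would be more robust
function hasChallengeForm(html) {
  for (const [tag] of html.matchAll(/<form\b[^>]*>/gi)) {
    const cls = /\bclass\s*=\s*(["'])(.*?)\1/i.exec(tag);
    if (cls && cls[2].split(/\s+/).includes('challenge-form')) return true;
  }
  return false;
}

// assumed response shape: { status: number, headers: { [lowercasedName]: string }, body: string }
function looksLikeUam(res) {
  const server = String(res.headers['server'] || '').toLowerCase();
  return res.status === 503 && server === 'cloudflare' && hasChallengeForm(res.body);
}
```

all three conditions have to hold at once: a plain 503 from some other server, or a cloudflare page without the form, is not treated as uam here.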

if we open the code of the UAM page, we immediately see this:

these params change on each page refresh - obviously, these are some params of the challenge, generated by the server. it's fairly trivial to parse them since this is just json5.

right after that a challenge script is loaded:

and.. (who would have thought) it consists of obfuscated garbage:

obfuscated script, part 1

it's immediately obvious that this script was obfuscated using a modified obfuscator.io: the strings array is encoded as '...'.split(',') instead of ["a",...].

this obfuscator gives itself away by: decoding strings with b('0xXX') (purple), shifting the array of strings (green) and using random 5-character strings (red).
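for reference, here's roughly how that string-array trick can be undone. this is a sketch under assumptions: the packed string, the shift count and the sample code below are made up (in a real script they would have to be pulled out of the source first), and i'm assuming the rotation is a plain "move the first element to the end, N times":

```javascript
// rebuild the string table: split the packed string, then rotate it left `shift` times
function buildStringTable(packed, shift) {
  const table = packed.split(',');
  for (let i = 0; i < shift % table.length; i++) table.push(table.shift());
  return table;
}

// replace every b('0xNN') call with the string literal it resolves to
// (decoderName is assumed to be a plain identifier without regex-special characters)
function inlineStrings(source, table, decoderName = 'b') {
  const call = new RegExp(`\\b${decoderName}\\('0x([0-9a-fA-F]+)'\\)`, 'g');
  return source.replace(call, (whole, hex) => {
    const value = table[parseInt(hex, 16)];
    // leave the call alone if the index is out of range
    return value === undefined ? whole : JSON.stringify(value);
  });
}

// made-up sample, not taken from a real challenge
const table = buildStringTable('log,cookie,console,document', 2);
console.log(inlineStrings("window[b('0x0')][b('0x2')](window[b('0x1')][b('0x3')])", table));
// → window["console"]["log"](window["document"]["cookie"])
```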

if we deobfuscate this goodiness and rename a few things, it becomes pretty clear what's going on:

  • a context for the challenge is created, the params from the received object (challengeOptions=window._cf_chl_opt) are added to it, and some "log" is created
  • the time of the challenge start is immediately appended to the log
  • cookie cf_chl_prog=e is set
  • /generate/ov1/... request is sent with the challenge params and some weird string

and here's the first problem: the string that we need to add is hidden inside the obfuscated code. it's fairly trivial to find it with a regex, though, if we make a bet about how this string is generated (i think it's something like ${Math.random()}:${floor(Date.now() / 1000)}:${sha256(previousString + SECRET_SALT)}).
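if that guess about the format holds (and it is only a guess), the search could look something like this. the regex and the findChallengeToken name are hypothetical:

```javascript
// assumed shape: '<Math.random() output>:<unix seconds>:<64 hex chars of sha256>'
// wrapped in matching single or double quotes
const TOKEN_RE = /(["'])(0\.\d+:\d{10}:[0-9a-f]{64})\1/;

// returns the first string literal that looks like the token, or null
function findChallengeToken(source) {
  const match = TOKEN_RE.exec(source);
  return match ? match[2] : null;
}
```

the 10-digit timestamp and the lowercase hex digest are part of the bet: if the real string is shaped differently, this simply returns null.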

who the fuck is window.sendRequest() though? it seems to be a function defined in the same obfuscated script, which makes a request to the given url and eval()-s the result:

eww, xhr

the one thing that immediately stands out is compressToEncodedURIComponent. if we google the name, https://github.com/pieroxy/lz-string/ will be one of the first results. to verify, we can compare the implementation:

in the uam script
in lz-string source code

looks similar!? but what is that string instead of keyStrUriSafe? it doesn't match the string in the lz-string source code:

and here's the fun part. this string is not a constant, and changes between challenges! so, to properly decode the body, we somehow need to extract this string from the obfuscated source code...
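once the per-challenge string is known, one way to deal with it without touching lz-string itself is to re-map the body between alphabets. a small sketch, assuming the challenge string is a same-order replacement for the stock url-safe alphabet (translateAlphabet is a made-up helper, not part of lz-string):

```javascript
// the stock url-safe alphabet from the lz-string source
const LZ_URI_SAFE = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-$';

// re-map every character of `text` from its position in `from` to the same position in `to`
function translateAlphabet(text, from, to) {
  let out = '';
  for (const ch of text) {
    const index = from.indexOf(ch);
    if (index === -1 || index >= to.length) {
      throw new Error(`character ${JSON.stringify(ch)} cannot be mapped`);
    }
    out += to[index];
  }
  return out;
}
```

translateAlphabet(body, challengeAlphabet, LZ_URI_SAFE) should then give something the stock decompressFromEncodedURIComponent from lz-string can read, and swapping the last two arguments goes the other way for a body that has to be sent.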

parsing js with regexes is fun

first of all, the signature above - '...'['charAt'](...) - is, unsurprisingly, not unique. and then – this signature isn't even always correct!

remember that this script is obfuscated with obfuscator.io? its "Control Flow Flattening" feature might randomly hit this function and move the string into a JS object, for example like this:

an unfortunate situation

and in some cases, "Dead Code Injection" might transform this function into something like this:

wtf, there are two strings here??

to solve this unfortunate problem, i used a very powerful technique called named regexes (fuck ast, all my homies parse code with regex).

explaining the regexes themselves would be super boring and tedious (and they're also incredibly weird), so i'll just give the gist:

  • let ?N (N is a number) be the number of the match group
  • let ?N<T> be the number of the match group with type T
  • let ^N be the regex reference to the N-th match group from the previous regex
  • let ^^N be the regex reference to the N-th match group from the pre-previous regex
  1. find the compressToEncodedURIComponent function signature: ?1<JSIdent>["compressToEncodedURIComponent"]=?2<JSIdent>['?3<JSIdent>']
  2. find the function prelude: '^3<JSIdent>': function(?1<JSIdent>), and parse its body
  3. find the function we're interested in (where the 2nd argument is 6), assuming Control Flow Flattening is not present: ^^2['?1<JSIdent>'](^1,6,function(?2<JSIdent>){return ?3<JSString>["charAt"](\2)})
  4. if we found it - great, the string (group 3) is the one we need. if not - try a different signature, assuming CFF: ^^2['?1<JSIdent>'](^1,6,function(?2<JSIdent>){return ?3['?4']["charAt"](\2)})
     • if we found it, search across the entire file for the key (group 3, in my experience there are no collisions): ?1["^3"]=?2, where ?2 is either a JS string or a regex reference, and ?1 is the key
  5. if we didn't find it - give up and try refetching the challenge, hoping that DCI won't be an issue
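to make the notation above a bit less abstract, here is a sketch of step 3 alone as an actual js regex with named groups. it is not the regex i used: the IDENT / STRING building blocks and the findAlphabet name are made up, and it skips steps 1-2 (so it matches any call with 6 as the middle argument) and the CFF fallback:

```javascript
// building blocks; String.raw keeps the backslashes intact
const IDENT = String.raw`[A-Za-z_$][\w$]*`;
const STRING = String.raw`'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"`;

// `..., 6, function (x) { return 'ALPHABET'['charAt'](x) }`
// \k<arg> makes sure charAt() receives the same identifier the function declares
const ALPHABET_RE = new RegExp([
  String.raw`,\s*6\s*,\s*function\s*\(\s*(?<arg>${IDENT})\s*\)\s*\{`,
  String.raw`\s*return\s*(?<str>${STRING})\s*\[\s*["']charAt["']\s*\]`,
  String.raw`\s*\(\s*\k<arg>\s*\)\s*;?\s*\}`,
].join(''));

// returns the alphabet without its quotes (escape sequences are not decoded), or null
function findAlphabet(source) {
  const match = ALPHABET_RE.exec(source);
  return match ? match.groups.str.slice(1, -1) : null;
}
```

a null here is the signal to move on to the CFF signature from step 4.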

ok so now we have the alphabet used to encode the request to the server. we can now try running the script and see what the server returns.

the hardest part is over, right???

obfuscated script, part 2

the server returns random garbage too (so unexpected, much wow):

mommy i dont wanna die

which is actually fairly simple to decode by looking at how this is done in the UAM code (in essence - js UTF-16 weirdness). after decoding, the random garbage becomes... random garbage:

very much dont wanna die

here some custom (or maybe commercial, i haven't seen it before) obfuscator is used, which uses the Ray-ID of the challenge as a key to decode strings. if we deobfuscate it, some fairly interesting code pops out, which is too big to screenshot, so here's a pastebin.

the script contains a sequence of several mini-tests of various browser apis. some of them aren't even executed (additional obfuscation or a bug?), some are additionally obfuscated with obfuscator.io, and the results of each test are appended to the "log" in the context of the challenge.

an incomplete list of browser features that are checked:

  • interpreter type coercion (stuff with +![] etc.)
  • eval()
  • availability of window.SHA256, which was defined in the original obfuscated script (fun fact: the sha256 impl is real. i would change the algorithm a bit for fun :3)
  • navigator.* apis
  • DOM apis (creation, interaction, deletion of elements + checking of elements in the original html returned by the server)
  • checking for global and process (global objects in Node.js)
  • Canvas API
  • WebSocket API
  • Image API
  • Cookie API
  • Errors & Stack traces (easily detects vms)
  • and a lot more, probably
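to give a feel for what such mini-tests can look like, here is an illustrative sketch. none of this is taken from the actual challenge script: the test names, the log shape and the exact checks are all made up, and the real thing is far more thorough:

```javascript
// each mini-test pushes [name, result] into a "log"
function runMiniTests() {
  const log = [];
  // type coercion: a spec-compliant interpreter has to get all of these right
  log.push(['coercion', +![] === 0 && +!![] === 1 && [] + [] === '' && [] + {} === '[object Object]']);
  // eval has to exist and actually evaluate code (indirect call, so it runs in global scope)
  log.push(['eval', (0, eval)('1 + 1') === 2]);
  // node.js exposes `global` and `process`; a regular browser page has neither
  log.push(['node-globals', typeof global !== 'undefined' || typeof process !== 'undefined']);
  // a real browser has a DOM and a navigator object
  log.push(['dom', typeof document !== 'undefined' && typeof document.createElement === 'function']);
  log.push(['navigator', typeof navigator !== 'undefined' && typeof navigator.userAgent === 'string']);
  return log;
}
```

the point is that every line is trivial on its own, but an emulated environment has to get all of them right at the same time, and the server sees the whole log.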

after this script is done, the entire context (along with the "log") is sent to the server, analyzed, and the server decides what to do with me (redirect to the site, give a captcha or send me another challenge).

i didn't even bother with the captcha and just sent it to a captcha-solving saas, so that part is fairly simple.

what is life

the inner workings of UAM reminded me a lot of a software-only SafetyNet, but on a much smaller scale (cloudflare doesn't even bother with wasm, just obfuscators).

i tried to use JSDOM to emulate all of the apis above, but failed miserably, so kinda gave up and wrote a tiny script with puppeteer

100 sloc script written in 30 minutes with puppeteer works, 700+ sloc script i spent a few days on with a lot of 228iq stuff - doesn't

come up with your own conclusions :3

and at last, a few memes from inside the final obfuscated script:

a message to those who came this far and a reminder that everything is fine and you shouldnt kys
a reference to an ancient 200x meme
i dont think i will

source code?

there won't be any. neither the puppeteer version, nor the original version. first of all, the code is shit, secondly, i just dont want to~

note as of translating: im not sure if i even have the sources anymore lol

i described a bunch of stuff here, so good luck and please reach out if you end up with something interesting :3