january 25, 2025 // 9 min read
hey so uhh quick intro
for the past like two years i've been working on mtcute, an mtproto client library in typescript. in case you didn't know, mtproto FUCKING SUCKS, but that's for another post.
and i've always wanted to build an online in-browser interactive tool that would allow me to poke around with telegram api, with minimum friction and maximum convenience. there are quite a few use-cases for this, and im not really going to go into detail here, but trust me, it's a pretty cool idea.
and so after a few weeks of work, i made a thing: play.mtcute.dev!
and while building it, i encountered quite a few issues that i want to share
and as such, we need to somehow make it available to the user's code.
something like https://esm.sh would probably work, but i really don't trust such services. and hosting something like this myself would defeat the entire point of it being fully in-browser.
so i decided to go with a different approach.
instead of using a cdn, i download the library directly from registry.npmjs.org (along with all its dependencies), untar and save them to indexeddb. and then... well, we need to somehow run it.
one might ask how is npm better than esm.sh in terms of security, and to that i have no definite answer, but like, if npm starts serving malicious code half the internet will be screwed anyway
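for the untar part, here's a rough sketch of a tiny tar reader. it assumes the tarball has already been fetched and gunzipped into a Uint8Array (in a browser, DecompressionStream('gzip') can do that), and it only understands plain ustar headers: pax extended headers and gnu long names are simply skipped. `untar` is a made-up name for this sketch, not necessarily how play.mtcute.dev does it.

```javascript
// minimal ustar reader: yields { name, data } for every regular file in the archive
function* untar(bytes) {
    const decoder = new TextDecoder()
    // header fields are NUL-padded ascii strings at fixed offsets
    const field = (start, length) => {
        const raw = bytes.subarray(start, start + length)
        const end = raw.indexOf(0)
        return decoder.decode(end === -1 ? raw : raw.subarray(0, end))
    }

    let offset = 0
    while (offset + 512 <= bytes.length) {
        const name = field(offset, 100)
        if (name === '') break // an all-zero block marks the end of the archive

        const prefix = field(offset + 345, 155)
        const size = parseInt(field(offset + 124, 12).trim(), 8) || 0 // size is stored in octal
        const type = field(offset + 156, 1)
        const dataStart = offset + 512

        // '0' (or NUL in very old archives) means "regular file"
        if (type === '0' || type === '') {
            yield {
                name: prefix ? `${prefix}/${name}` : name,
                data: bytes.subarray(dataStart, dataStart + size),
            }
        }

        // file contents are padded up to a multiple of 512 bytes
        offset = dataStart + Math.ceil(size / 512) * 512
    }
}
```

npm tarballs usually put everything under a package/ directory, so that prefix would need stripping before the files go into indexeddb.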
initially i wanted to simply go with something like esbuild-wasm with the user's code as an entrypoint, bundle everything into a single file and then just run that file as a web worker.
except esbuild-wasm weighs about 11mb 🥴
not even considering the bundling performance overhead, that's a lot of wasted bandwidth. surely there's a better way?
a good friend of mine (s/o @kamillaova) reminded me about import maps.
in case you haven't heard of them, they're a fairly recent addition to the web platform that lets you map esm import specifiers to urls.
a simple esm.sh import map for mtcute would look something like this:
<script type="importmap">
{
    "imports": {
        "@mtcute/web": "https://esm.sh/@mtcute/web"
    }
}
</script>
the issues were immediately obvious, however:
but still, import maps would allow us to avoid bundling at all!
and the issues above could probably be worked around.
so i still decided to give it a try.
to make the library available by url without any external backend, we can just serve it from a service worker.
service workers allow intercepting all requests on our origin, and they also have full access
to indexeddb (where we store the downloaded code).
so serving the library from a service worker is as simple as:
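globalThis.addEventListener('fetch', (event) => {
    // request.url is always an absolute url, so match on its pathname
    const { pathname } = new URL(event.request.url)
    if (pathname.startsWith('/sw/runtime/')) {
        event.respondWith(serveFromIdb(event.request.url))
    }
})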
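serveFromIdb itself isn't shown here, so here's a rough guess at its shape. to keep the sketch runnable outside a browser it reads from a hypothetical `store` (anything with a `get(path)` method, e.g. a thin wrapper around indexeddb) instead of talking to indexeddb directly. the name `serveFromStore`, the `store` interface and the mime table are all assumptions. the one part that really matters: browsers refuse to run module scripts that aren't served with a javascript mime type.

```javascript
// just enough mime types for a typical npm package (an assumption, extend as needed)
const MIME_TYPES = {
    js: 'text/javascript',
    mjs: 'text/javascript',
    json: 'application/json',
    wasm: 'application/wasm',
}

// `store` is a hypothetical key-value storage: get(path) returns the file body or undefined
async function serveFromStore(requestUrl, store, prefix = '/sw/runtime/') {
    const { pathname } = new URL(requestUrl)
    // "/sw/runtime/@mtcute/web/index.js" -> "@mtcute/web/index.js"
    const path = decodeURIComponent(pathname.slice(prefix.length))

    const body = await store.get(path)
    if (body == null) {
        return new Response(`not found: ${path}`, { status: 404 })
    }

    const extension = path.slice(path.lastIndexOf('.') + 1)
    return new Response(body, {
        headers: { 'Content-Type': MIME_TYPES[extension] ?? 'application/octet-stream' },
    })
}
```

since respondWith accepts a promise of a Response, something like this could be passed to it as-is.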
and then we can just use that url in the import map:
<script type="importmap">
{
"imports": {
"@mtcute/web": "/sw/runtime/@mtcute/web/index.js",
...
}
}
</script>
note: since mtcute and all its deps use npm's package.json, when generating the import map we need to keep in mind the respective exports fields (and also module/main/browser, god damnit 🤮)
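to give an idea of what that means in practice, here's a heavily simplified sketch of picking an entrypoint from a parsed package.json and turning it into an import map. it only looks at the root (".") export, silently falls back to browser/module/main when exports doesn't resolve, and ignores subpath patterns and the object form of browser. `pickTarget`, `resolveEntry`, `buildImportMap` and the set of accepted conditions are all made up for this example.

```javascript
// export conditions a browser-side resolver would plausibly accept (an assumption)
const CONDITIONS = new Set(['browser', 'import', 'default'])

// walks a (possibly nested) conditional "exports" target, first matching key wins
function pickTarget(target) {
    if (typeof target === 'string') return target
    if (target === null || typeof target !== 'object') return null
    for (const [condition, value] of Object.entries(target)) {
        if (!CONDITIONS.has(condition)) continue
        const found = pickTarget(value)
        if (found) return found
    }
    return null
}

// picks the entrypoint of a parsed package.json
function resolveEntry(pkg) {
    const exp = pkg.exports
    if (exp != null) {
        // "exports" is either the root target itself or a map of subpaths ("." , "./utils", ...)
        const isSubpathMap = typeof exp === 'object'
            && Object.keys(exp).some((key) => key.startsWith('.'))
        const found = pickTarget(isSubpathMap ? exp['.'] : exp)
        if (found) return found
    }
    const browser = typeof pkg.browser === 'string' ? pkg.browser : undefined
    return browser ?? pkg.module ?? pkg.main ?? 'index.js'
}

// `packages` is a list of parsed package.json objects
function buildImportMap(packages, base = '/sw/runtime/') {
    const imports = {}
    for (const pkg of packages) {
        const entry = resolveEntry(pkg).replace(/^\.\//, '')
        imports[pkg.name] = `${base}${pkg.name}/${entry}`
    }
    return { imports }
}
```

subpath exports (anything other than ".") would need their own entries in the map, which this sketch doesn't attempt.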
the remaining issues are very much linked together – we need to provide import maps at runtime.
since we can't use a web worker, let's use an iframe instead! and since we can't add import maps at runtime, we can just generate the entire html page on the fly:
const html = `
<html>
<script type="importmap">${generateImportMap()}</script>
...some more html idk...
</html>
`
const url = URL.createObjectURL(new Blob([html], { type: 'text/html' }))
const iframe = document.createElement('iframe')
iframe.src = url
document.body.appendChild(iframe)
except... BAM!
due to this chrome bug, our generated iframe won't be able to load the libraries from our service worker, because it's not considered the same origin
no biggie, let's just serve the html directly from our service worker:
globalThis.addEventListener('fetch', (event) => {
    // request.url is always an absolute url, so match on its pathname
    if (new URL(event.request.url).pathname === '/sw/runtime/_iframe.html') {
        event.respondWith(generateIframeHtml())
    }
})
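generateIframeHtml isn't shown either, so here's one way it might look. the names `renderIframeHtml` and `htmlResponse`, and the idea of loading the user's code from an `entryUrl`, are assumptions for this sketch. the `<` escaping is there so that nothing inside the import map json can accidentally close the script tag early.

```javascript
// renders the runner page; `entryUrl` (hypothetical) is assumed to be a trusted url
function renderIframeHtml(importMap, entryUrl) {
    // "\u003c" is a valid json escape for "<", so the map still parses the same
    const json = JSON.stringify(importMap).replace(/</g, '\\u003c')
    return `<!doctype html>
<html>
<head>
<script type="importmap">${json}</script>
<script type="module" src="${entryUrl}"></script>
</head>
<body></body>
</html>`
}

// wraps the page into a Response that a service worker can pass to respondWith
function htmlResponse(html) {
    return new Response(html, {
        headers: { 'Content-Type': 'text/html; charset=utf-8' },
    })
}
```

the import map script has to come before any module script on the page, hence the ordering in the template.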
and at that point, everything seemed to work??
as you probably know, javascript at its core is single-threaded. this is useful in most cases, but can be quite annoying in others. especially in this particular one – a repl.
in case a user ends up writing a computation-heavy task (or just accidentally does a while (true) {}), the entire page will freeze and potentially crash. and it is something i specifically don't want in our case.
an obvious solution – just run the user-provided code in a web worker..?
except we can't (see above).
but wait. we are already running the user's code in an iframe!
isn't it already a separate thread?
NO
i used to think that it would run in a separate thread too, honestly.
but when i actually tried to run a while (true) {} in an iframe, it froze the entire page, not just the iframe.
the same issue can be seen in many in-browser playgrounds out there, like the solid.js one, the vue.js one, and probably more...
this section is very much a probably, i didn't do a lot of research on this, but this sounds reasonable enough
afaiu, the fact that it's run on the same thread is primarily due to the contentWindow api.
see, when the iframe is same-origin, we can actually access its DOM from the parent window:
const iframe = document.querySelector('iframe')
iframe.contentWindow.document.body.innerHTML = 'meow'
and because of that, browsers have to share the same javascript runtime thread between the iframe and the parent window.
for cross-origin iframes, however, the only way of communication is via postMessage (similar to web workers!), and as such the browser can create a separate thread for it.
i found this post that explains the issue in detail, as well as providing some pointers on how to work around it.
tldr – this is very much implementation-specific. but usually, the iframe is considered cross-origin if the etld+1 of the parent and the child are different.
etld is the effective top-level domain, and etld+1 is that plus one more label to the left (dont worry, i haven't heard this term before either). example.com and very.example.com have the same etld+1 (example.com), but example.com and example.co.uk don't.
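to make the etld+1 thing a bit more concrete, here's a toy version of the comparison. real browsers use the full public suffix list; the four-entry `SUFFIXES` set below is a hand-written stand-in for it, and `etldPlusOne` / `sameSite` are names made up for this sketch.

```javascript
// tiny stand-in for the public suffix list (an assumption, the real list is huge)
const SUFFIXES = new Set(['com', 'dev', 'uk', 'co.uk'])

// returns the etld+1 of a hostname, or null if it can't be determined
function etldPlusOne(hostname, suffixes = SUFFIXES) {
    const host = hostname.toLowerCase()
    if (suffixes.has(host)) return null // the host is itself a public suffix

    const labels = host.split('.')
    // try the longest candidate suffix first, so "co.uk" wins over "uk"
    for (let i = 1; i < labels.length; i++) {
        if (suffixes.has(labels.slice(i).join('.'))) {
            return labels.slice(i - 1).join('.')
        }
    }
    return null
}

function sameSite(a, b) {
    const siteA = etldPlusOne(a)
    return siteA !== null && siteA === etldPlusOne(b)
}
```

with this, sameSite('example.com', 'very.example.com') is true, while sameSite('example.com', 'example.co.uk') is false.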
one way we can force the browser to isolate the iframe is using the sandbox attribute:
<iframe sandbox="allow-scripts"></iframe>
but... well, it's no longer same-origin :D
and that's an issue!
because we can no longer access our service worker from the iframe and load the libraries from it.
so basically to make things work the way i intended, i would need some kind of sandbox attribute that would isolate the javascript runtime thread (and disable the contentWindow api that i dont even use anyway), while the frame is still considered same-origin.
and browsers don't have anything like that!! :<
at this point i basically had two options:
postMessage
i went with the latter because i really wanted to make my repl resilient to broken user code.
important!
having two origins might be a security concern! what if a bad actor embeds my "worker" iframe in their website and steals everything?
i had to be extra careful to verify the origin of every incoming message, to make sure it's our "frontend" talking to our "worker" iframe, and not some malicious website.
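the check itself is tiny. a minimal sketch, assuming the frontend's origin is known ahead of time; `guardMessages` is a hypothetical helper and the origin in the usage comment is a guess, not the real setup:

```javascript
// wraps a message handler so that it only ever sees messages from `allowedOrigin`
function guardMessages(allowedOrigin, handler) {
    return (event) => {
        // event.origin is set by the browser and can't be spoofed by the sender
        if (event.origin !== allowedOrigin) return false
        handler(event.data, event.source)
        return true
    }
}

// usage inside the "worker" iframe (the origin here is an assumption):
// globalThis.addEventListener('message', guardMessages('https://play.mtcute.dev', (data, source) => {
//     source.postMessage({ ok: true }, 'https://play.mtcute.dev')
// }))
```

replying with an explicit target origin (instead of '*') covers the other direction, so the response can't end up in a window that navigated somewhere else.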
at this point i already had like 90% of the project finished, so refactoring everything to two-origin architecture took some effort, but it was definitely worth it.
i had to move the following to the "worker" iframe:
and the overall architecture ended up looking something like this:
[diagram lost; the one surviving label reads "postMessage to talk to..."]
phew. that's some enterprise-grade backend architecture, right in your browser!
i really hope that one day browsers will make this kind of stuff easier to implement 🙏
at this point everything was working fine... in chrome :D
as soon as i opened the page in firefox, i was greeted with an incredibly helpful "The operation is insecure" error.
after some digging, it turned out that firefox has something called state partitioning.
tldr: normally browsers always keyed websites' data by their origin. but trackers can (and do! cant have shit in this economy) abuse this to track users across websites.
a simple example of that would be:
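// https://tracking.tei.su/get-user-id.html
globalThis.addEventListener('message', (event) => {
    if (!localStorage.userId) localStorage.userId = crypto.randomUUID()
    // reply to whoever asked; cross-origin windows need an explicit target origin
    event.source.postMessage(localStorage.userId, event.origin)
})
// which can then be used to track users across websites by simply doing:
const iframe = document.createElement('iframe')
iframe.src = 'https://tracking.tei.su/get-user-id.html'
// ask for the id once the frame has loaded
iframe.addEventListener('load', () => {
    iframe.contentWindow.postMessage('gimme', 'https://tracking.tei.su')
})
document.body.appendChild(iframe)
// the reply arrives on our own window, not on the iframe element
globalThis.addEventListener('message', (event) => {
    if (event.origin !== 'https://tracking.tei.su') return
    console.log('you are %s!', event.data)
})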
to avoid this, firefox keys the data by a combination of the iframe's origin and the top window's origin. this way, the above code would return different results for different websites.
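as a toy model of what that keying means (nothing to do with how firefox actually implements it; `PartitionedStorage` and `getUserId` are invented for this sketch), imagine every storage bucket being looked up by both origins at once:

```javascript
// toy model: one storage bucket per (top window origin, iframe origin) pair
class PartitionedStorage {
    constructor() {
        this.buckets = new Map()
    }

    // the bucket an iframe at `frameOrigin` sees when embedded under `topOrigin`
    bucketFor(topOrigin, frameOrigin) {
        const key = `${topOrigin}^${frameOrigin}`
        if (!this.buckets.has(key)) this.buckets.set(key, new Map())
        return this.buckets.get(key)
    }
}

let nextId = 0
// same logic as the tracker above, but against a bucket instead of localStorage
function getUserId(bucket) {
    if (!bucket.has('userId')) bucket.set('userId', `user-${++nextId}`)
    return bucket.get('userId')
}

const storage = new PartitionedStorage()
const onSiteA = getUserId(storage.bucketFor('https://a.example', 'https://tracking.tei.su'))
const onSiteB = getUserId(storage.bucketFor('https://b.example', 'https://tracking.tei.su'))
const onSiteAAgain = getUserId(storage.bucketFor('https://a.example', 'https://tracking.tei.su'))
```

here onSiteA and onSiteAAgain are the same id, but onSiteB is a different one, so the tracker can no longer tell that it's the same person on both sites.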
this... doesn't really sound like an issue for our case though?
ikr?? we barely store anything outside of the worker's origin, and only access our worker from a single origin.
but for WHATEVER REASON (likely due to some bug in firefox) our runner iframe seemed to have a separate service worker from the worker iframe. and a separate indexeddb. and only in some cases. 🥴
some stuff did seemingly get fixed by simply updating firefox to the latest version, and some other stuff i had to refactor from the worker iframe to the runner iframe. ugh, so annoying.
...
i have no idea how people even write outros so uhh
thanks for reading this rambling of a post i guess?
ok bye