how i built mtcute repl

january 25, 2025 // 9 min read

hey so uhh quick intro

for the past like two years i've been working on mtcute, an mtproto client library in typescript. in case you didn't know, mtproto FUCKING SUCKS, but that's for another post.

and i've always wanted to build an in-browser interactive tool that would let me poke around with the telegram api, with minimum friction and maximum convenience. there are quite a few use-cases for this, and i'm not really going to go into detail here, but trust me, it's a pretty cool idea.

and so after a few weeks of work, i made a thing: play.mtcute.dev!
and while building it, i encountered quite a few issues that i want to share

mtcute is a library

and as such, we need to somehow make it available to the user's code.

something like https://esm.sh would probably work, but i really don't trust such services. and hosting something like this myself would defeat the entire point of it being fully in-browser.

so i decided to go with a different approach.

instead of using a cdn, i download the library directly from registry.npmjs.org (along with all its dependencies), untar and save them to indexeddb. and then... well, we need to somehow run it.
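
the "untar" part is less scary than it sounds. here's a rough sketch of a minimal tar reader, not the actual code behind the repl. `readTar` is a made-up name, it assumes the tarball was already gunzipped (e.g. with `DecompressionStream('gzip')`), and it ignores the ustar prefix field and pax headers, so very long paths are not handled:

```javascript
// walks the 512-byte tar headers and yields every regular file it finds
function* readTar(bytes) {
  const decoder = new TextDecoder()
  // reads a NUL-terminated ascii field at an absolute position
  const field = (start, length) => {
    const raw = bytes.subarray(start, start + length)
    const end = raw.indexOf(0)
    return decoder.decode(end === -1 ? raw : raw.subarray(0, end))
  }

  let offset = 0
  while (offset + 512 <= bytes.length) {
    const name = field(offset, 100)
    if (name === '') break // an all-zero block marks the end of the archive

    // the size is stored as an octal ascii string
    const size = Number.parseInt(field(offset + 124, 12).trim(), 8) || 0
    const type = field(offset + 156, 1)
    const dataStart = offset + 512

    // '0' (or a NUL byte in old archives) means "regular file"
    if (type === '0' || type === '') {
      yield { name, data: bytes.subarray(dataStart, dataStart + size) }
    }

    // file contents are padded up to the next 512-byte boundary
    offset = dataStart + Math.ceil(size / 512) * 512
  }
}
```

npm tarballs usually put everything under a `package/` directory, so that prefix would need to be stripped before saving the files.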

one might ask how is npm better than esm.sh in terms of security, and to that i have no definite answer, but like, if npm starts serving malicious code half the internet will be screwed anyway

initially i wanted to simply go with something like esbuild-wasm with user's code as an entrypoint, bundle everything into a single file and then just run that file as a web worker.

except esbuild-wasm weighs about 11mb 🥴

not even considering the bundling performance overhead, that's a lot of wasted bandwidth. surely there's a better way?

import maps

a good friend of mine (s/o @kamillaova) reminded me about import maps.

in case you haven't heard of them, it's a recent addition to the web platform, allowing you to map esm imports to somewhere.

a simple esm.sh import map for mtcute would look something like this:

<script type="importmap">
{
  "imports": {
    "@mtcute/web": "https://esm.sh/@mtcute/web"
  }
}
</script>

the issues were immediately obvious, however:

  1. the library must be available by url
  2. dynamic import maps are not invented yet, meaning we can't add an import map at runtime
  3. import maps are not supported by web workers

but still, import maps would allow us to avoid bundling at all!

and the issues above could probably be worked around.
so i still decided to give it a try.

importing by url

to make the library available by url without any external backend, we can just serve it from a service worker.

service workers allow intercepting all requests on our origin, and they also have full access to indexeddb (where we store the downloaded code).
so serving the library from a service worker is as simple as:

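globalThis.addEventListener('fetch', (event) => {
  // request.url is always an absolute url, so match against its pathname
  const { pathname } = new URL(event.request.url)
  if (pathname.startsWith('/sw/runtime/')) {
    event.respondWith(serveFromIdb(event.request.url))
  }
})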
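
to give a rough idea of what a `serveFromIdb`-like function has to figure out, here's a sketch under some assumptions: `resolveRuntimeFile` and the mime table are made up for illustration, and a plain `Map` stands in for indexeddb to keep things short:

```javascript
const RUNTIME_PREFIX = '/sw/runtime/'

// hypothetical extension -> mime type table, just enough for js packages
const MIME_TYPES = new Map([
  ['js', 'text/javascript'],
  ['mjs', 'text/javascript'],
  ['json', 'application/json'],
  ['wasm', 'application/wasm'],
])

// maps a request url to { status, headers, body }, or null if it's not ours
function resolveRuntimeFile(store, requestUrl) {
  const { pathname } = new URL(requestUrl)
  if (!pathname.startsWith(RUNTIME_PREFIX)) return null

  // "/sw/runtime/@mtcute/web/index.js" -> "@mtcute/web/index.js"
  const key = decodeURIComponent(pathname.slice(RUNTIME_PREFIX.length))
  const body = store.get(key)
  if (body === undefined) return { status: 404, headers: {}, body: null }

  const extension = key.slice(key.lastIndexOf('.') + 1)
  const type = MIME_TYPES.get(extension) ?? 'application/octet-stream'
  return { status: 200, headers: { 'Content-Type': type }, body }
}
```

in a real service worker the result would be turned into `new Response(body, { status, headers })`. the content type matters here: browsers refuse to run module scripts that aren't served with a javascript mime type.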

and then we can just use that url in the import map:

<script type="importmap">
{
  "imports": {
    "@mtcute/web": "/sw/runtime/@mtcute/web/index.js",
    ...
  }
}
</script>

note: since mtcute and all its deps use npm's package.json, when generating the import map we need to keep in mind the respective exports fields (and also module/main/browser god damnit 🤮)
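
here's roughly what that resolution could look like. this is a simplified sketch and not the repl's actual resolver: the function names, the `['browser', 'import']` condition list and the fallback order are all my assumptions, and it only resolves the package root (no subpath exports, no object-form `browser` field):

```javascript
// resolves a single "exports" target: a string, an array of fallbacks,
// or an object of conditions that is checked in key order
function resolveTarget(target, conditions) {
  if (typeof target === 'string') return target
  if (Array.isArray(target)) {
    for (const item of target) {
      const resolved = resolveTarget(item, conditions)
      if (resolved) return resolved
    }
    return null
  }
  if (target && typeof target === 'object') {
    for (const [condition, value] of Object.entries(target)) {
      if (condition === 'default' || conditions.includes(condition)) {
        const resolved = resolveTarget(value, conditions)
        if (resolved) return resolved
      }
    }
  }
  return null
}

// picks the entry file of a package from its package.json
function resolveEntry(pkg, conditions = ['browser', 'import']) {
  const exp = pkg.exports
  if (exp != null) {
    // "exports" is either a target itself or a map of subpaths ("." is the root)
    const isSubpathMap = typeof exp === 'object' && !Array.isArray(exp)
      && Object.keys(exp).some(key => key.startsWith('.'))
    const resolved = resolveTarget(isSubpathMap ? exp['.'] : exp, conditions)
    if (resolved) return resolved
  }
  // legacy fields, used when "exports" is missing or didn't match
  const legacy = [pkg.browser, pkg.module, pkg.main].find(v => typeof v === 'string')
  return legacy ?? './index.js'
}

// builds one [specifier, url] pair for the import map
function importMapEntry(pkg, baseUrl = '/sw/runtime/') {
  const entry = resolveEntry(pkg).replace(/^\.\//, '')
  return [pkg.name, `${baseUrl}${pkg.name}/${entry}`]
}
```

`Object.fromEntries(packages.map(pkg => importMapEntry(pkg)))` would then give the `imports` object for the import map.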

dynamic import maps

the remaining issues are very much linked together – we need to provide import maps at runtime.

since we can't use a web worker, let's use an iframe instead! and since we can't add import maps at runtime, we can just generate the entire html page on the fly:

const html = `
  <html>
  <script type="importmap">${generateImportMap()}</script>
  ...some more html idk...
  </html>
`

const url = URL.createObjectURL(new Blob([html], { type: 'text/html' }))
const iframe = document.createElement('iframe')
iframe.src = url
document.body.appendChild(iframe)

except... BAM!

due to this chrome bug, our generated iframe won't be able to load the libraries from our service worker, because it's not considered the same origin

no biggie, let's just serve the html directly from our service worker:

globalThis.addEventListener('fetch', (event) => {
  // compare the pathname, since request.url is always an absolute url
  if (new URL(event.request.url).pathname === '/sw/runtime/_iframe.html') {
    event.respondWith(generateIframeHtml())
  }
})
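
for illustration, generating that html could look something like the sketch below. `renderRunnerHtml` and its parameters are hypothetical (the real page surely has more going on), and `entryUrl` is assumed to be a trusted path we generate ourselves:

```javascript
// builds the runner page: the import map first, then the user's entry module
function renderRunnerHtml(imports, entryUrl) {
  // escape "<" so a stray "</script>" inside a string can't close the tag early
  const importMap = JSON.stringify({ imports }).replace(/</g, '\\u003c')
  return [
    '<!doctype html>',
    '<html>',
    '<head>',
    `<script type="importmap">${importMap}</script>`,
    `<script type="module" src="${entryUrl}"></script>`,
    '</head>',
    '<body></body>',
    '</html>',
  ].join('\n')
}
```

the order matters: the import map has to appear before any module script that relies on it.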

and at that point, everything seemed to work??

threading in browsers

as you probably know, javascript at its core is single-threaded. this is fine in most cases, but can be quite annoying in others, especially for this particular one: a repl.

if a user ends up writing a computation-heavy task (or just accidentally does a while (true) {}), the entire page will freeze and potentially crash. and that's something i specifically don't want in our case.

an obvious solution – just run the user-provided code in a web worker..?

except we can't (see above).

but wait. we are already running the user's code in an iframe!
isn't it already a separate thread?

NO

i used to think that it would run in a separate thread too, honestly.

but when i actually tried to run a while (true) {} in an iframe, it froze the entire page, not just the iframe.

the same issue can be seen in many in-browser playgrounds out there, like the solid.js one, the vue.js one, and probably more...

but why?

this section is very much a probably, i didn't do a lot of research on this, but this sounds reasonable enough

afaiu, the fact that it's run on the same thread is primarily due to the contentWindow api.

see, when the iframe is same-origin, we can actually access its DOM from the parent window:

const iframe = document.querySelector('iframe')
iframe.contentWindow.document.body.innerHTML = 'meow'

and because of that, browsers have to share the same javascript runtime thread between the iframe and the parent window.

for cross-origin iframes, however, the only way of communication is via postMessage (similar to web workers!), and as such the browser can create a separate thread for it.

what even is a cross-origin iframe?

i found this post that explains the issue in detail, as well as providing some pointers on how to work around it.

tldr: this is very much implementation-specific. but usually, the iframe is treated as cross-origin for this purpose if the etld+1 of the parent and the child are different.

etld is the effective top-level domain (think com or co.uk), and etld+1 is that plus one more label to the left.
(dont worry, i hadn't heard this term before either)

example.com and very.example.com have the same etld+1 (example.com), but example.com and example.co.uk don't.
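
to make the etld+1 thing a bit more concrete, here's a toy sketch. the suffix list is a tiny hard-coded assumption (browsers use the full public suffix list), and `etldPlusOne` / `sameSite` are names i made up:

```javascript
// a tiny stand-in for the public suffix list
const SUFFIXES = new Set(['com', 'dev', 'uk', 'co.uk'])

// returns the etld+1 of a hostname, or null if we can't tell
function etldPlusOne(hostname) {
  const labels = hostname.toLowerCase().split('.')
  // scanning from the left finds the longest matching suffix first
  for (let i = 0; i < labels.length; i++) {
    const suffix = labels.slice(i).join('.')
    if (SUFFIXES.has(suffix)) {
      // the hostname itself is a bare suffix, so there is no "+1" label
      return i === 0 ? null : labels.slice(i - 1).join('.')
    }
  }
  return null
}

function sameSite(a, b) {
  const site = etldPlusOne(a)
  return site !== null && site === etldPlusOne(b)
}
```

with this, `sameSite('example.com', 'very.example.com')` is true, while `sameSite('example.com', 'example.co.uk')` is false.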

one way we can force the browser to isolate the iframe is using the sandbox attribute:

<iframe sandbox="allow-scripts"></iframe>

but... well, it's no longer same-origin :D

and that's an issue!
because we can no longer access our service worker from the iframe and load the libraries from it.

what do we do?

so basically to make things work the way i intended, i would need some kind of sandbox attribute that would isolate the javascript runtime thread (and disable the contentWindow api that i dont even use anyway), while the frame is still considered same-origin.

and browsers don't have anything like that!! :<

at this point i basically had two options:

  • give up on trying to separate the worker into a separate thread
  • make an actual cross-origin iframe where most of the work would happen, and our "main" window would just be a frontend talking to it via postMessage

the great separation

i went with the latter because i really wanted to make my repl resilient to broken user code.

important!

having two origins might be a security concern! what if a bad actor embeds my "worker" iframe in their website and steals everything?

i had to be extra careful to verify the origin of every incoming message, to make sure it's our "frontend" talking to our "worker" iframe, and not some malicious website.
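
the check itself is tiny. a minimal sketch, where `guardMessages` is a hypothetical helper and the origins in the usage note are placeholders:

```javascript
// wraps a message handler so it only ever sees messages from allowed origins
function guardMessages(allowedOrigins, handle) {
  const allowed = new Set(allowedOrigins)
  return (event) => {
    // event.origin is set by the browser, the sender can't fake it
    if (!allowed.has(event.origin)) return false
    handle(event.data, event)
    return true
  }
}
```

inside the worker iframe this would be hooked up with something like `globalThis.addEventListener('message', guardMessages(['https://frontend.example'], onMessage))`. the other half is always passing an explicit target origin when sending, instead of `'*'`.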

at this point i already had like 90% of the project finished, so refactoring everything to two-origin architecture took some effort, but it was definitely worth it.

i had to move the following to the "worker" iframe:

  • authorization and session management
  • library downloading
  • the service worker, along with iframe html generation

and the overall architecture ended up looking something like this:

(diagram, with a "worker origin" group)

  frontend -> worker iframe: embeds + talks
  frontend -> runner iframe: embeds + talks
  worker iframe -> service worker: talks
  service worker -> runner iframe: serves
  service worker -> indexeddb: stores libs
  runner iframe -> indexeddb: stores sessions

  • frontend is the actual app the user interacts with. just a normal frontend app, but instead of some rest api, it uses postMessage to talk to...
  • worker iframe – which implements most of the business logic, as well as manages the service worker and storage
    • service worker is used to serve the library code, as well as the...
    • runner iframe – which is the "sandbox" in which the user's code is actually run
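
the "postMessage instead of a rest api" part boils down to a small request/response protocol. here's a rough sketch of the calling side, assuming a made-up message shape (`{ id, method, params }` going out, `{ id, result }` or `{ id, error }` coming back). none of these names come from the actual repl:

```javascript
// send(message) is whatever delivers a message to the other side,
// e.g. a wrapper around iframe.contentWindow.postMessage
function createRpcClient(send) {
  let nextId = 0
  const pending = new Map()

  return {
    // sends a request and resolves once the matching reply arrives
    call(method, params) {
      const id = nextId++
      // register before sending, in case the reply comes back synchronously
      const promise = new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject })
      })
      send({ id, method, params })
      return promise
    },

    // feed every (origin-checked!) incoming message into this
    handleMessage(message) {
      const entry = pending.get(message.id)
      if (!entry) return false // not a reply to anything we asked
      pending.delete(message.id)
      if ('error' in message) entry.reject(new Error(message.error))
      else entry.resolve(message.result)
      return true
    },
  }
}
```

the ids are what lets several calls be in flight at once, since replies can arrive in any order.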

phew. that's some enterprise-grade backend architecture, right in your browser!
i really hope that one day browsers will make this kind of stuff easier to implement 🙏

cross-origin is not the silver bullet, actually

at this point everything was working fine... in chrome :D

as soon as i opened the page in firefox, i was greeted with an incredibly helpful "The operation is insecure"

after some digging, it turned out that firefox has something called state partitioning

tldr: traditionally, browsers keyed websites' data by origin alone. but trackers can (and do! cant have shit in this economy) abuse this to track users across websites.

a simple example of that would be:

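// https://tracking.tei.su/get-user-id.html
globalThis.addEventListener('message', (event) => {
  if (!localStorage.userId) localStorage.userId = crypto.randomUUID()
  // without an explicit target origin the reply only reaches same-origin windows
  event.source.postMessage(localStorage.userId, event.origin)
})

// which can then be used to track users across websites by simply doing:
const iframe = document.createElement('iframe')
iframe.src = 'https://tracking.tei.su/get-user-id.html'
// the tracker only replies to a message, so poke it once it has loaded
iframe.addEventListener('load', () => {
  iframe.contentWindow.postMessage('hi', 'https://tracking.tei.su')
})
document.body.appendChild(iframe)

// the reply arrives on our window, not on the iframe element
globalThis.addEventListener('message', (event) => {
  if (event.origin !== 'https://tracking.tei.su') return
  console.log('you are %s!', event.data)
})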

to avoid this, firefox keys the data by a combination of the iframe's origin and the top window's origin. this way, the above code would return different results for different websites.
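
as a toy model (definitely not how firefox implements it, and all the names here are made up), partitioned storage is basically a map keyed by two things instead of one:

```javascript
class PartitionedStorage {
  constructor() {
    this.buckets = new Map()
  }

  // the bucket is picked by BOTH the top-level site and the frame's origin
  bucketFor(topLevelSite, frameOrigin) {
    const key = JSON.stringify([topLevelSite, frameOrigin])
    if (!this.buckets.has(key)) this.buckets.set(key, new Map())
    return this.buckets.get(key)
  }
}
```

so the tracker iframe embedded on site a and the same iframe embedded on site b get two unrelated buckets, and the user id it stored in one is simply not there in the other.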

this... doesn't really sound like an issue for our case though?

ikr?? we barely store anything outside of the worker's origin, and only access our worker from a single origin.

but for WHATEVER REASON (likely due to some bug in firefox) our runner iframe seemed to have a separate service worker from the worker iframe. and a separate indexeddb. and only in some cases. 🥴

some stuff did seemingly get fixed by simply updating firefox to the latest version, and some other stuff i had to refactor from the worker iframe to the runner iframe. ugh, so annoying.

...

i have no idea how people even write outros so uhh
thanks for reading this rambling of a post i guess?

ok bye