Despite its role as the cornerstone of web application security, it’s clear that many (most?) web professionals do not understand Same Origin Policy (SOP), or hold one or more misconceptions about what SOP requires.
It’s a big topic, and I don’t plan to address it all on this quiet Friday morning. This post will be the first in a multi-part series about SOP and what it means for the web platform.
In order to understand SOP, you must first understand what an origin is. For the purposes of this post, I’ll use a simplified definition: an origin is a string consisting of the protocol/scheme and fully qualified hostname of a given piece of content. A webpage from http://www.example.com/a.htm has the origin “http,www.example.com”.
The reality is a bit more complicated than that; in every browser except Internet Explorer, the origin includes the server’s port (if specified), while in IE, the content’s Security Zone is a part of the origin instead. However, neither of these is relevant in most cases.
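As a rough illustration of the definitions above, the sketch below computes the simplified scheme-plus-hostname origin string and a port-aware comparison of the kind non-IE browsers perform. The helper names simplifiedOrigin and sameOrigin are made up for this example; only the standard URL class is a real API, and IE’s Security Zone variant is not modeled.

```javascript
// Hypothetical helpers for illustration; only the built-in URL class is a real API.

// Simplified origin as described in this post: scheme plus fully qualified hostname.
function simplifiedOrigin(url) {
  const u = new URL(url);
  // u.protocol includes the trailing colon ("http:"), so strip it.
  return `${u.protocol.slice(0, -1)},${u.hostname}`;
}

// Port-aware comparison: scheme, host, and port must all match.
function sameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port; // "" when the scheme's default port is in use
}

console.log(simplifiedOrigin("http://www.example.com/a.htm"));
console.log(sameOrigin("http://www.example.com/a.htm", "http://www.example.com/b.htm"));
console.log(sameOrigin("http://www.example.com/a.htm", "https://www.example.com/a.htm"));
console.log(sameOrigin("http://www.example.com/a.htm", "http://www.example.com:8080/a.htm"));
```

Note that the path (a.htm versus b.htm) plays no part in the comparison; only the scheme, host, and (outside IE) port do.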
Over the years, I’ve heard many incorrect statements about SOP. For instance: “SOP means that one page cannot use resources from a different server” (wrong!) or “SOP means that one page cannot send data to a different server” (wrong!) or “SOP means that pages from one site are completely immune to tampering by other sites” (wrong!).
Each of these myths has a grain of truth in it, but with Same Origin Policy, the devil is most definitely in the details. Virtually every restriction mentioned in this blog post has one or more exceptions, but we’ll start with the basics and leave the exceptions to future posts.
While my heritage is DOS, not Unix, I like to think of Same Origin Policy using the Read, Write, Execute (RWX) permissions model commonly used in Unix filesystems (and NTFS, of course). In this model, a user/process is granted permission to perform zero or more of these three operations on a given file.
I like explaining SOP using the RWX model because it is familiar to many technical people. The simplest explanation of SOP is that Origin “A” has the following permissions:
Now, each of these points has subtleties that I will address in future posts, but the rest of today’s post is about the first one: Same Origin Policy means that pages from Origin “A” may not read the contents of resources in Origin “B”.
In this model, “deny read” means that script running in Origin “A” must not be permitted to utilize content from Origin “B” in such a way that the resource from “B” can be effectively reconstructed by “A”. So, for instance, a webpage from Origin “A”:
…and so on.
If content from one origin was able to read content loaded from another origin, one site could easily attack another site. For instance, an IFRAME from attacker.com could read the contents of another IFRAME from yourbank.com. And so forth. The number and scope of attacks against sensitive resources would only be limited by the attacker’s imagination.
A key point in all of this is that abusing the user’s browser to load content from the victim server sends that user’s authentication (cookies, credentials, etc) to the victim server. The attacker needs these credentials to be sent to the victim server in order to get access to content worth stealing.
Stated another way, if the attacker could directly download protected resources from yourbank.com without using your browser, he absolutely would do so. But he can’t, because only your browser has the cookies and credentials that yourbank.com requires in order to return protected content.
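To make that point concrete, here is a toy model of the two request paths. Everything in it (cookieJar, bankServer, browserRequest, attackerDirectRequest, and the cookie value) is invented for illustration and is not how any real browser or bank is implemented.

```javascript
// Toy model: every name and value here is hypothetical.

// Cookies held by the victim's browser, keyed by host.
const cookieJar = { "yourbank.com": "session=alice-secret" };

// The bank returns protected content only for a valid session cookie.
function bankServer(request) {
  if (request.headers.cookie === "session=alice-secret") {
    return { status: 200, body: "Balance: $1,234" };
  }
  return { status: 401, body: "Please log in" };
}

// A request made by the victim's browser: cookies for the target host are
// attached automatically, no matter which page triggered the load.
function browserRequest(host) {
  return bankServer({ host, headers: { cookie: cookieJar[host] } });
}

// A request made from the attacker's own machine: there is no cookie jar to draw on.
function attackerDirectRequest(host) {
  return bankServer({ host, headers: {} });
}

console.log(attackerDirectRequest("yourbank.com").status);
console.log(browserRequest("yourbank.com").status);
```

In this model the protected response only ever exists inside the victim’s browser, which is exactly where “deny read” has to keep a page from attacker.com from getting at it.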
Each of the “deny read” restrictions is fairly simple in theory, but as the web platform gets ever-richer, the challenge in enforcing these restrictions continues to grow.
If a getPixel(x,y) method existed, an attacker could embed a frame, image, or video from a victim site, then use the getPixel method to read all of the pixels out of it, allowing reconstruction of the contents. The attacking page could then send this data back to the bad guy’s server.
Unless the browser implementer takes care to block this attack, an attacker could render another origin’s images onto a canvas, then use toDataURL to steal copies of those cross-origin resources. To block the attack, the browser must keep track of any use of cross-origin images and block any subsequent calls to the canvas’ toDataURL() function with a Security Exception. The browser must also ensure that this protection isn’t circumvented by redirects; for instance, where an attacker instructs the canvas to draw http://a.com/safe.png, but that URL returns an HTTP/302 redirection to http://b.com/victim.png.
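The bookkeeping described above can be sketched as a small simulation. This is a toy model rather than real browser code: ModelCanvas, fetchImage, and the redirect table are hypothetical names, and the returned data URL is a placeholder. It assumes the browser judges an image by the origin of the URL that finally served it, after redirects.

```javascript
// Toy model of origin-clean tracking; not real browser code.
// ModelCanvas, fetchImage, and the redirect table are hypothetical.

// Pretend server behavior: safe.png answers with a redirect to another origin.
const redirects = { "http://a.com/safe.png": "http://b.com/victim.png" };

// Follow redirects and report the origin of the URL that finally served the image.
function fetchImage(url) {
  let finalUrl = url;
  const seen = new Set(); // guards against redirect loops
  while (redirects[finalUrl] && !seen.has(finalUrl)) {
    seen.add(finalUrl);
    finalUrl = redirects[finalUrl];
  }
  return { requestedUrl: url, finalOrigin: new URL(finalUrl).origin };
}

class ModelCanvas {
  constructor(pageUrl) {
    this.pageOrigin = new URL(pageUrl).origin;
    this.originClean = true;
  }

  drawImage(image) {
    // Compare against where the bytes actually came from, not the requested URL.
    if (image.finalOrigin !== this.pageOrigin) {
      this.originClean = false; // never reset once tainted
    }
  }

  toDataURL() {
    if (!this.originClean) {
      throw new Error("SecurityError: canvas is not origin-clean");
    }
    return "data:image/png;base64,..."; // placeholder, no real pixels here
  }
}

const canvas = new ModelCanvas("http://a.com/page.htm");
canvas.drawImage(fetchImage("http://a.com/logo.png"));
console.log(canvas.originClean); // still clean: same-origin image

canvas.drawImage(fetchImage("http://a.com/safe.png")); // redirects to b.com
console.log(canvas.originClean); // now tainted
```

Once the flag is cleared, it stays cleared for the life of the canvas in this model, so drawing a same-origin image afterwards does not make toDataURL() usable again.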
That’s it for now; see you in Part 2: Limited Write.
As a security model, it sure looks much too complicated to be foolproof!
How is the getPixel problem handled in Flash & Silverlight, compared to canvas?
@L: I don't know what Flash does, as I know very little about that platform.
For Silverlight, I've been told "Silverlight's WriteableBitmap will actually refuse to give you the pixels if you're snapping an element with an image that came from some other domain."
This came up recently because someone is experimenting with a Canvas implementation in Silverlight: http://blogs.msdn.com/delay/archive/2009/08/24/using-one-platform-to-build-another-html-5-s-canvas-tag-implemented-using-silverlight.aspx
This is easily the best explanation of SOP I've come across. Thanks much! I'll definitely be sharing this with my peers.
Aseem (former IE intern if you remember =D)
Interesting thread from Mozilla arguing for same-origin requirements for video: http://lists.xiph.org/pipermail/theora/2008-November/001958.html
Very straightforward explanation! Any chance to see Part 2 of this article?
thanks for that explanation. first sensible one I've seen so far. I'm currently hating SOP as it doesn't even seem to be possible to access images from my own subdomain... I guess redirecting a subdomain to a different site could be otherwise used maliciously. still very annoying, limiting images to exactly the domain where my page is downloaded from.
@Marc: How specifically are you trying to "access images"? It's entirely legal to have an IMG tag pointing pretty much anywhere. If, however, you were to try to render a cross-origin image onto a CANVAS tag, you'd find that CANVAS tag gets flagged "origin unclean" and you won't be able to call toDataURL() on it anymore. (If you could, you could "spy on" the pixels of a cross-domain image).
I have a question:
I have a video tag and the source of the video being in the same folder as the page that displays it. I also have a canvas on that page and by clicking a button I draw the video's image at a specific frame onto the canvas. When I try and execute the canvas's method toDataURL() after I have drawn the image from the video to the canvas (ctxDraw.drawImage(video, 0, 0, w, h);) I get the following: dom exception: Security_ERR(18). I would have thought that because the video is local that it should allow me to gather the pixel data and should keep the canvas's origin-clean flag clean?
Your help and advice on this matter will be greatly appreciated as I have been struggling with it for a while.
@Brad: Which browser? IE9 Beta had a too-strict limit there, which was fixed for IE9 RC to match the standard. I don't know if other browsers match the spec. If you have a repro URL, I'm happy to look.
Excellent article. I have a question:
I have the scenario where the main HTML page is loaded from domain "A" and there is an iFrame created which loads the page from Domain "B". Domain A and Domain B are different but are known and can trust each other. My requirement is to read the DOM of the page from domain "B". Is it possible? How do I do that?
I am using IE9, Windows 7
@Amit: No, this is not possible, as explained in the article above. If you want to communicate between these two frames, use the postMessage API.
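For readers with the same question, a minimal sketch of the postMessage approach might look like the following. The origins https://a.example and https://b.example, the TRUSTED_ORIGIN constant, the handleMessage helper, and the message shape are all assumptions made for this example; postMessage, event.origin, and event.data are the real API pieces. The events at the bottom are simulated so the origin check can be exercised outside a browser.

```javascript
// Sketch only: TRUSTED_ORIGIN, handleMessage, and the message shape are assumptions.
const TRUSTED_ORIGIN = "https://b.example";

// Runs in the page from Domain "A". Accepts data only from the trusted frame.
function handleMessage(event) {
  if (event.origin !== TRUSTED_ORIGIN) {
    return null; // ignore messages from any other origin
  }
  return event.data;
}

// In a browser, the page from Domain "A" would wire this up roughly as:
//   window.addEventListener("message", (e) => { const d = handleMessage(e); /* use d */ });
// and the page in the Domain "B" frame would volunteer its own data with:
//   window.parent.postMessage({ title: document.title }, "https://a.example");

// Simulated events standing in for what the browser would deliver.
console.log(handleMessage({ origin: "https://b.example", data: { title: "Hello" } }));
console.log(handleMessage({ origin: "https://evil.example", data: "spoofed" }));
```

The page from Domain "A" still never reads the other frame’s DOM directly; the page from Domain "B" decides what to send, and each side checks who it is talking to.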