IEInternals

A look at Internet Explorer from the inside out. @EricLaw left Microsoft in 2012, but was named an IE MVP in '13 & an IE userAgent (http://useragents.ie) in '14

Same Origin Policy Part 1: No Peeking

Despite its role as the cornerstone of web application security, it’s clear that many (most?) web professionals do not understand Same Origin Policy (SOP), or hold one or more misconceptions about what SOP requires.

It’s a big topic, and I don’t plan to address it all on this quiet Friday morning. This post will be the first in a multi-part series about SOP and what it means for the web platform.

So, what’s an Origin anyway?

In order to understand SOP, you must first understand what an origin is. For the purposes of this post, I’ll use a simplified definition: an origin is a string consisting of the protocol/scheme and fully qualified hostname of a given piece of content. A webpage from http://www.example.com/a.htm has the origin “http,www.example.com”.

The reality is a bit more complicated than that; in every browser except Internet Explorer, the origin includes the server’s port (if specified), while in IE, the content’s Security Zone is a part of the origin instead. However, neither of these is relevant in most cases.
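
To make the definition concrete, here is a minimal sketch of an origin comparison. The names originOf and sameOrigin are hypothetical helpers, not browser APIs, and filling in default ports 80 and 443 is my assumption. Passing comparePort = false only approximates IE ignoring the port; Security Zones are not modeled at all.

```typescript
// Simplified model of an origin: scheme, host, and (outside IE) port.
interface Origin {
  scheme: string;
  host: string;
  port: number;
}

// Assumption: URLs without an explicit port use the scheme's default.
const DEFAULT_PORTS: Record<string, number> = { http: 80, https: 443 };

// Hypothetical helper: extracts the origin from an absolute http(s) URL.
function originOf(url: string): Origin {
  const match = /^(https?):\/\/([^\/:?#]+)(?::(\d+))?/i.exec(url);
  if (match === null) {
    throw new Error("not an absolute http(s) URL: " + url);
  }
  const scheme = match[1].toLowerCase();
  const host = match[2].toLowerCase();
  const port = match[3] !== undefined ? parseInt(match[3], 10) : DEFAULT_PORTS[scheme];
  return { scheme: scheme, host: host, port: port };
}

// Two URLs share an origin when scheme and host match (and, optionally, port).
function sameOrigin(a: string, b: string, comparePort: boolean = true): boolean {
  const x = originOf(a);
  const y = originOf(b);
  return x.scheme === y.scheme && x.host === y.host && (!comparePort || x.port === y.port);
}
```

Note that the path plays no part: a.htm and b.htm on the same host share an origin, while http and https versions of the same page do not.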

Common Misconceptions

Over the years, I’ve heard many incorrect statements about SOP.  For instance: “SOP means that one page cannot use resources from a different server” (wrong!) or “SOP means that one page cannot send data to a different server” (wrong!) or “SOP means that pages from one site are completely immune to tampering by other sites” (wrong!).

Each of these myths has a grain of truth in it, but with Same Origin Policy, the devil is most definitely in the details. Virtually every restriction mentioned in this blog post has one or more exceptions, but we’ll start with the basics and leave the exceptions to future posts.

The “Read Write Execute” Mental Model

While my heritage is DOS, not Unix, I like to think of Same Origin Policy using the Read, Write, Execute (RWX) permissions model commonly used in Unix filesystems (and NTFS, of course). In this model, a user/process is granted permission to perform zero or more of these three operations on a given file.

I like explaining SOP using the RWX model because it is familiar to many technical people. The simplest explanation of SOP is that Origin “A” has the following permissions:

  • Read of resources from Origin “B”: Deny
  • Write to Origin “B”: Limit
  • Execute of resources from Origin “B”: Allow
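
The three bullets above can be written down as a tiny lookup table. This is only a toy model: the type names are mine, and treating same-origin access as always allowed is an assumption made for the sake of the sketch.

```typescript
type Operation = "read" | "write" | "execute";
type Verdict = "allow" | "limit" | "deny";

// Cross-origin verdicts, exactly as listed in the bullets above.
const CROSS_ORIGIN: Record<Operation, Verdict> = {
  read: "deny",
  write: "limit",
  execute: "allow",
};

// Assumption: same-origin access is unrestricted in this toy model.
function verdict(op: Operation, isSameOrigin: boolean): Verdict {
  return isSameOrigin ? "allow" : CROSS_ORIGIN[op];
}
```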

Now, each of these points has subtleties that I will address in future posts, but the rest of today’s post is about the first one: Same Origin Policy means that pages from Origin “A” may not read the contents of resources in Origin “B”.

What does “Deny Read” mean exactly?

In this model, “deny read” means that script running in Origin “A” must not be permitted to use content from Origin “B” in such a way that the resource from “B” can be effectively reconstructed by “A”. So, for instance, a webpage from Origin “A”:

  1. May execute a script from “B”
  2. Must not be permitted to get the raw source code of that script
  3. May apply (execute) a CSS stylesheet from “B”
  4. Must not be permitted to get the raw text of that stylesheet
  5. May include (execute) a frame pointed at an HTML page from “B”
  6. Must not be permitted to get the inner HTML of that frame
  7. May draw (execute) an image from “B”
  8. Must not be permitted to examine the bits of that image
  9. May play (execute) a video from “B”
  10. Must not be permitted to reconstruct the video by capturing images of it

…and so on.

What’s the big deal? Why Deny Read?

If content from one origin were able to read content loaded from another origin, one site could easily attack another site. For instance, an IFRAME from attacker.com could read the contents of another IFRAME from yourbank.com. And so forth. The number and scope of attacks against sensitive resources would only be limited by the attacker’s imagination.

Couldn’t the attacker’s server simply make a direct request to the victim server?

A key point in all of this is that abusing the user’s browser to load content from the victim server sends that user’s authentication (cookies, credentials, etc.) to the victim server. The attacker needs these credentials to be sent to the victim server in order to get access to content worth stealing.

Stated another way, if the attacker could directly download protected resources from yourbank.com without using your browser, he absolutely would do so. But he can’t, because only your browser has the cookies and credentials that yourbank.com requires in order to return protected content.
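
The difference between the two request paths can be sketched as follows. Everything here is a toy model: the cookie jar is keyed by hostname only (real browsers also scope cookies by path, flags, and more), and browserRequest and attackerServerRequest are hypothetical names, not real APIs.

```typescript
// Toy cookie jar: hostname -> cookie string.
type CookieJar = Record<string, string>;

interface OutgoingRequest {
  host: string;
  cookie: string | null;
}

// Hypothetical: the browser attaches the target host's cookies no matter
// which page caused the request to be made.
function browserRequest(jar: CookieJar, targetHost: string): OutgoingRequest {
  const hasCookie = Object.prototype.hasOwnProperty.call(jar, targetHost);
  return { host: targetHost, cookie: hasCookie ? jar[targetHost] : null };
}

// The attacker's own server cannot see the user's cookie jar, so its
// direct request arrives unauthenticated.
function attackerServerRequest(targetHost: string): OutgoingRequest {
  return { host: targetHost, cookie: null };
}
```

In this sketch only the request made through the user’s browser carries the cookie, which is why the attacker wants the browser to do the fetching.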

Complexity Kills

Each of the “deny read” restrictions is fairly simple in theory, but as the web platform gets ever-richer, the challenge in enforcing these restrictions continues to grow.

Let’s start with a simple one: someone once proposed that JavaScript should offer a getPixel(x, y) method. This method would take a point on the screen and return the color of the pixel at that point. Such a method would be useful for a variety of purposes, but if you think about it for just a moment, it’s clear that such a method plainly violates rules #8 and #10, and probably also violates rules #2, #4, #6, albeit indirectly.

If a getPixel(x,y) method existed, an attacker could embed a frame, image, or video from a victim site, then use the getPixel method to read all of the pixels out of it, allowing reconstruction of the contents. The attacking page could then send this data back to the bad guy’s server.

So, JavaScript does not offer a getPixel method. However, other features that suffer from the same threats do exist. Consider the HTML5 <canvas> element. The canvas object offers the ability to draw an image from an arbitrary origin onto itself. It also offers a toDataURL() function that allows serialization of the canvas’ contents to a data URL.

Unless the browser implementer takes care to block this attack, an attacker could render another origin’s images onto a canvas, then use toDataURL to steal copies of those cross-origin resources. To block the attack, the browser must keep track of any use of cross-origin images and block any subsequent calls to the canvas’ toDataURL() function with a Security Exception. The browser must also ensure that this protection isn’t circumvented by redirects; for instance, where an attacker instructs the canvas to draw http://a.com/safe.png, but that URL returns an HTTP/302 redirection to http://b.com/victim.png.
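
The bookkeeping described above can be sketched as a single flag on the canvas. This is a simplified model and not any browser’s actual implementation: ModelCanvas and LoadedImage are hypothetical types, origins are plain strings, and the returned data URL is a placeholder.

```typescript
// Hypothetical stand-in for a fetched image: the origin that was asked for,
// and the origin that finally served the bytes after any redirects.
interface LoadedImage {
  requestedOrigin: string;
  finalOrigin: string;
}

class ModelCanvas {
  private pageOrigin: string;
  private originClean: boolean = true;

  constructor(pageOrigin: string) {
    this.pageOrigin = pageOrigin;
  }

  // Drawing is always allowed, but a cross-origin image taints the canvas.
  // The check uses finalOrigin, not requestedOrigin, so a redirect from
  // a.com to b.com cannot hide where the pixels really came from.
  drawImage(image: LoadedImage): void {
    if (image.finalOrigin !== this.pageOrigin) {
      this.originClean = false;
    }
  }

  // Serialization is refused once the canvas has been tainted.
  toDataURL(): string {
    if (!this.originClean) {
      throw new Error("SecurityError: canvas is not origin-clean");
    }
    return "data:,"; // placeholder; a real canvas would encode its pixels
  }
}
```

Once the flag is cleared it stays cleared in this model, so drawing a same-origin image afterwards does not restore the ability to read the canvas back.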

Other features or plugins may offer similar functionality, and subtlety abounds. For instance, consider what would happen if JavaScript did have a getPixel() method, but it was restricted to be available only to same-origin IFRAMEs. An attacker might try to circumvent this restriction by layering an IFRAME with 1% opacity over a victim frame. Unless the browser implementer took care to prevent this attack, an attacker could effectively steal an image of a cross-origin frame using this technique.
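
To see why the 1% opacity trick would work, consider the arithmetic for a single color channel. The sketch below assumes the usual “source over” blend onto an opaque backdrop and 8-bit channels; blend and recoverBottom are hypothetical helper names, and the recovery is only approximate because of rounding.

```typescript
// Standard "source over" blend for one color channel (0..255), assuming an
// opaque backdrop. alpha is the opacity of the top layer (0..1).
function blend(top: number, bottom: number, alpha: number): number {
  return Math.round(alpha * top + (1 - alpha) * bottom);
}

// Inverts the blend. The attacker knows its own overlay color and opacity,
// so the only unknown is the victim's pixel underneath.
function recoverBottom(observed: number, top: number, alpha: number): number {
  return Math.round((observed - alpha * top) / (1 - alpha));
}

// A black overlay at 1% opacity over a victim channel value of 200:
const observed = blend(0, 200, 0.01);                // 198
const recovered = recoverBottom(observed, 0, 0.01);  // 200
```

Because the overlay contributes only 1% of the final color, nearly all of the victim’s pixel survives in the composite, and the attacker can estimate the original value closely.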

That’s it for now; see you in Part 2: Limited Write.

-Eric

  • As a security model, it sure looks much too complicated to be foolproof!

    How is the getPixel problem handled in Flash & Silverlight, compared to canvas?

  • @L: I don't know what Flash does, as I know very little about that platform.

    For Silverlight, I've been told "Silverlight's WriteableBitmap will actually refuse to give you the pixels if you're snapping an element with an image that came from some other domain."

    This came up recently because someone is experimenting with a Canvas implementation in Silverlight: http://blogs.msdn.com/delay/archive/2009/08/24/using-one-platform-to-build-another-html-5-s-canvas-tag-implemented-using-silverlight.aspx

  • Hi Eric,

    This is easily the best explanation of SOP I've come across. Thanks much! I'll definitely be sharing this with my peers.

    Best,

    Aseem (former IE intern if you remember =D)

  • Interesting thread from Mozilla arguing for same-origin requirements for video: http://lists.xiph.org/pipermail/theora/2008-November/001958.html

  • Very straightforward explanation! Any chance to see Part 2 of this article?

  • thanks for that explanation. first sensible one I've seen so far. I'm currently hating SOP as it doesn't even seem to be possible to access images from my own subdomain... I guess redirecting a subdomain to a different site could be otherwise used maliciously. still very annoying, limiting images to exactly the domain where my page is downloaded from.

  • @Marc: How specifically are you trying to "access images"? It's entirely legal to have an IMG tag pointing pretty much anywhere. If, however, you were to try to render a cross-origin image onto a CANVAS tag, you'd find that CANVAS tag gets flagged "origin unclean" and you won't be able to call toDataURL() on it anymore. (If you could, you could "spy on" the pixels of a cross-domain image).

  • Hi Eric,

    Brilliant article.

    I have a question:

    I have a video tag and the source of the video being in the same folder as the page that displays it. I also have a canvas on that page and by clicking a button I draw the video's image at a specific frame onto the canvas. When I try and execute the canvas's method toDataURL() after I have drawn the image from the video to the canvas (ctxDraw.drawImage(video, 0, 0, w, h);) I get the following: dom exception: Security_ERR(18). I would have thought that because the video is local that it should allow me to gather the pixel data and should keep the canvas's origin-clean flag clean?

    Your help and advice on this matter will be greatly appreciated as I have been struggling with it for a while.

  • @Brad: Which browser? IE9 Beta had a too-strict limit there, which was fixed for IE9 RC to match the standard. I don't know if other browsers match the spec. If you have a repro URL, I'm happy to look.

  • Excellent article. I have a question:

    I have the scenario where the main HTML page is loaded from domain "A" and there is an iFrame created which loads the page from Domain "B". Domain A and Domain B are different but are known and can trust each other. My requirement is to read the DOM of page from domain "B". Is it possible ? How do I do  that ?

    I am using IE9, Windows 7

  • @Amit: No, this is not possible, as explained in the article above. If you want to communicate between these two frames, use the postMessage API.
