What is cross-site scripting?
by Chad Perrin, TechRepublic
Cross-site scripting, also known as "XSS," is a class of security exploit that has gotten a fair bit of attention in the last few years. Many users, and even Web developers, aren't entirely clear on what the term means, however. I'll explain cross-site scripting for you, so you will know where the dangers lie.
Defining cross-site scripting
JavaScript is a powerful tool for developing rich Web applications. Without client-side execution of code embedded in HTML and XHTML pages, the dynamic nature of Web applications like Google Maps (http://maps.google.com/), Try Ruby! (http://tryruby.hobix.com/) and Zoho Office (http://www.zoho.com/) would not be possible. Unfortunately, any time you add complexity to a system, you increase the potential for security issues -- and adding JavaScript to a Web page is no exception.
Among the problems introduced by JavaScript are:
- A malicious Web site might employ JavaScript to make changes to the local system, such as copying or deleting files.
- A malicious Web site might employ JavaScript to monitor activity on the local system, such as with keystroke logging.
- A malicious Web site might employ JavaScript to interact with other Web sites the user has open in other browser windows or tabs.
The first and second problems in the above list can be mitigated by turning the browser into a sort of "sandbox" that limits the way JavaScript is allowed to behave so that it only works within the browser's little world. The third can be limited somewhat as well, but it is all too easy to get around that limitation because whether a particular Web page can interact with another Web page in a given manner may not be something that can be controlled by the software employed by the end user. Sometimes, the ability of one Web site's JavaScript to steal data meant for another Web site can only be limited by the due diligence of the other Web site's developers.
What defines cross-site scripting is that vulnerabilities in a given Web site's use of dynamic Web design elements may give someone the opportunity to use JavaScript for security compromises. It's called "cross-site" because it involves interactions between two separate Web sites to achieve its goals. In many cases, even though the exploit itself uses JavaScript, the Web site that's vulnerable to cross-site scripting does not have to employ JavaScript at all. Only in the case of local cross-site scripting exploits does the vulnerability have to exist in JavaScript sent to the browser by a legitimate Web site.
Types of cross-site scripting
There are currently three major categories of cross-site scripting. Others may be discovered in the future, however, so don't think this sort of misuse of Web page vulnerability is necessarily limited to these three types.
Reflected:
Probably the most common type of cross-site scripting exploit is the reflected exploit. It targets vulnerabilities that occur in some Web sites when data submitted by the client is immediately processed by the server to generate results that are then sent back to the browser on the client system. An exploit succeeds if code sent to the server is included in the page returned to the browser without HTML special character encoding, so that the browser interprets it as code rather than displaying it as inert visible text.
The most common way to make use of this exploit probably involves a link using a specially crafted URL, such that a variable passed in the URL to be displayed on the page contains malicious code. Something as simple as another URL used by the server-side code to produce links on the page, or even a user's name to be included in the page text so that the user can be greeted by name, can become a vulnerability employed in a reflected cross-site scripting exploit.
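The greet-by-name case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `name` parameter, and the example.com URL are hypothetical, and a real application would normally rely on its framework's template escaping.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def greet_unsafe(url: str) -> str:
    """Vulnerable: reflects the 'name' parameter into the page unchanged."""
    name = parse_qs(urlsplit(url).query).get("name", ["guest"])[0]
    return f"<p>Hello, {name}!</p>"


def greet_safe(url: str) -> str:
    """Encodes HTML special characters before reflecting the value."""
    name = parse_qs(urlsplit(url).query).get("name", ["guest"])[0]
    return f"<p>Hello, {escape(name)}!</p>"


# Hypothetical crafted link carrying a script element in the 'name' parameter.
crafted = "http://example.com/hello?name=%3Cscript%3Ealert(1)%3C/script%3E"
print(greet_unsafe(crafted))  # the script element reaches the browser intact
print(greet_safe(crafted))    # the same input arrives as inert visible text
```

The only difference between the two functions is the call to `escape`, which is the "HTML special character encoding" step the reflected exploit depends on being absent.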
Stored:
Also known as HTML injection attacks, stored cross-site scripting exploits are those where some data sent to the server is stored (typically in a database) to be used in the creation of pages that will be served to other users later. If your site has a stored cross-site scripting vulnerability, this form of exploit can affect any visitor to your Web site. The classic example of this sort of vulnerability is content management software such as forums and bulletin boards where users are allowed to use raw HTML and XHTML to format their posts.
As with preventing reflected exploits, the key to securing your site against stored exploits is ensuring that all submitted data is translated to display entities before it is shown, so that the browser will not interpret it as code.
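A minimal sketch of that idea for a forum-style site follows. The in-memory `posts` list stands in for the database, and the function names are hypothetical; the point is only that encoding happens when stored data is written into a page, so every later visitor receives inert text.

```python
from html import escape

# Hypothetical in-memory stand-in for the database of forum posts.
posts = []


def submit_post(body: str) -> None:
    """Store the submission exactly as received."""
    posts.append(body)


def render_posts() -> str:
    """Encode each stored post at output time, for every viewer."""
    items = "".join(f"<li>{escape(body)}</li>" for body in posts)
    return f"<ul>{items}</ul>"


submit_post("Nice article!")
submit_post('<img src=x onerror="alert(1)">')
print(render_posts())
```

Encoding at output time rather than at storage time is a design choice made for this sketch: it keeps the stored data unmodified and ensures that nothing reaches a page without passing through `escape`.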
Local:
A local cross-site scripting exploit targets vulnerabilities within the code of a Web page itself. These vulnerabilities are the result of incautious use of the Document Object Model in JavaScript so that opening another Web page with malicious JavaScript code in it at the same time might actually alter the code in the first page on the local system. In older versions of Internet Explorer (before IE 6 on MS Windows XP Service Pack 2), in fact, this could even be used on local Web pages (stored on the local computer rather than retrieved from the World Wide Web), and through those pages break out of the browser "sandbox" to affect the local system with the user privileges used to run the browser. Because most MS Windows users have tended to run everything as the Administrator account, this effectively meant that local cross-site scripting exploits on MS Windows before XP Service Pack 2 could do just about anything.
In a local cross-site scripting exploit, unlike reflected and stored exploits, no malicious code is sent to the server at all. The behaviour of the exploit takes place entirely on the local client system, but it alters the pages provided by the otherwise benign Web site before they are interpreted by the browser so that they behave as though they carried the malicious payload to the client from the server. This means that server-side protections that filter out or block malicious cross-site scripting will not work with this sort of exploit. For more about local cross-site scripting, see the explanation at DOM Based Cross Site Scripting (http://www.webappsec.org/projects/articles/071105.shtml).
Protection against cross-site scripting
The most comprehensive way to protect your Web design from being exploited by cross-site scripting is to translate any and all special characters in user-provided input -- even in URLs -- into display entities, such as HTML entities (http://www.w3schools.com/tags/ref_entities.asp). This applies not only to server-side code written in languages like PHP, Perl, and ASP.NET, but also to JavaScript that works with any user-provided input. This may interfere with the operation of Web sites where users expect to be able to use HTML and XHTML in their input, such as Web site design helper applications -- in which case more complex code may be needed to protect against malicious code. Such fine-grained filtering is just one side of an arms race against malicious security crackers, however, and cannot reasonably be 100% effective.
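As a rough illustration of encoding input that ends up both in a URL and in visible text, the sketch below uses Python's standard `html.escape` and `urllib.parse.quote`. The `profile_link` function and the `/profile?user=` path are hypothetical, chosen only to show the two encodings side by side.

```python
from html import escape
from urllib.parse import quote


def profile_link(username: str) -> str:
    """Build a link whose URL and visible text both come from user input."""
    # Percent-encode the value before it becomes part of a URL.
    href = "/profile?user=" + quote(username, safe="")
    # escape() converts &, <, > and, by default, both quote characters.
    return f'<a href="{escape(href)}">{escape(username)}</a>'


print(profile_link('"><script>alert(1)</script>'))
```

The input is encoded twice because it lands in two different contexts: percent-encoding keeps it from altering the URL, and entity encoding keeps it from closing the attribute or opening a new tag.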
Another way to protect your Web site from cross-site scripting exploits is to never directly use any user-provided input in your pages. For example, you can accept only a limited number of input values and use each one as a "key," for lack of a better term, to choose from among certain predefined options. User-provided input then still determines the output without ever appearing in it, though this obviously greatly limits the dynamic nature of Web applications. If your Web site does not need greater dynamism than this provides, however, this may be your safest option for generating output based on user input.
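The "keys" approach might look something like the following sketch. The `THEMES` table, its entries, and the fallback message are hypothetical; what matters is that the user's input is only ever used for a lookup and is never written into the page.

```python
# Hypothetical set of predefined options; only these strings ever reach the page.
THEMES = {
    "light": "Light theme selected.",
    "dark": "Dark theme selected.",
}
DEFAULT_MESSAGE = "Default theme selected."


def theme_message(user_choice: str) -> str:
    """Use the input only as a lookup key; never echo it back."""
    return f"<p>{THEMES.get(user_choice, DEFAULT_MESSAGE)}</p>"


print(theme_message("dark"))
print(theme_message("<script>alert(1)</script>"))  # falls back to the default
```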
Similarly, input validation that simply strips out all characters unauthorised for specific, limited input types (such as removing everything but dashes, parentheses, periods, and digits from input expected to contain telephone numbers), or that rejects input containing unauthorised characters entirely, can be used. This is a useful technique for many forms of input, but not all. Such validation techniques should be used whenever possible, because they not only provide some protection against cross-site scripting, but also against direct attempts to compromise the server itself through buffer overflows, SQL injection, and other attempts to exceed the bounds of the system.
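Both variants of the telephone number example, stripping and rejecting, can be sketched with a single regular expression. The function names are hypothetical, and the allowed character set is taken literally from the list above, so even spaces are treated as unauthorised.

```python
import re

# Allowed for telephone numbers: digits, dashes, parentheses, and periods.
DISALLOWED = re.compile(r"[^0-9().\-]")


def strip_phone(raw: str) -> str:
    """Remove every character outside the allowed set."""
    return DISALLOWED.sub("", raw)


def validate_phone(raw: str) -> str:
    """Reject the input outright if it contains any disallowed character."""
    if DISALLOWED.search(raw):
        raise ValueError("telephone number contains disallowed characters")
    return raw


print(strip_phone("(555)123-4567<script>"))
print(validate_phone("555.123.4567"))
```

Rejecting is usually the more predictable of the two, since stripping can turn hostile input into something that merely looks valid.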
Cookies are often used to provide some form of security against cross-site scripting. Many cross-site scripting exploits are designed to "steal" session cookies, but a cookie can be "tied" to a particular IP address so that hijacked cookies fail validation when employed by cross-site scripting exploits. There are, of course, potential work-arounds for this sort of security, such as when the legitimate user of a given cookie and a cross-site scripting exploit both originate from behind the same proxy server or Network Address Translation (NAT) device. Internet Explorer implements an HttpOnly flag that prevents client-side scripts from reading a cookie, to try to guard against this sort of cookie abuse. It is ineffective against cross-site request forgery attacks, however, where unintended requests may be sent via cross-site scripting exploits alongside a cookie used to authorise the requests at the server.
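One possible way to combine the two cookie measures is sketched below, using Python's standard `http.cookies` and `hmac` modules. Tying the cookie to an address by signing the session ID together with the IP is an assumption of this sketch, not a method the article prescribes; `SECRET_KEY`, the function names, and the addresses are hypothetical.

```python
import hashlib
import hmac
from http.cookies import SimpleCookie

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret


def sign(session_id: str, client_ip: str) -> str:
    """Bind a session ID to the IP address it was issued to."""
    message = f"{session_id}|{client_ip}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_cookie(session_id: str, client_ip: str) -> str:
    """Return a Set-Cookie header value carrying the HttpOnly flag."""
    cookie = SimpleCookie()
    cookie["session"] = f"{session_id}.{sign(session_id, client_ip)}"
    cookie["session"]["httponly"] = True
    return cookie["session"].OutputString()


def is_valid(cookie_value: str, client_ip: str) -> bool:
    """Check that the cookie is presented from the address it was tied to."""
    session_id, _, signature = cookie_value.partition(".")
    expected = sign(session_id, client_ip)
    return hmac.compare_digest(signature.encode(), expected.encode())


header = issue_cookie("abc123", "203.0.113.5")
value = header.split(";")[0].split("=", 1)[1]
print(header)
print(is_valid(value, "203.0.113.5"), is_valid(value, "198.51.100.7"))
```

As the paragraph above notes, a check like `is_valid` cannot tell apart two clients behind the same proxy or NAT device, so it narrows the window for cookie theft rather than closing it.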
The single most effective means of avoiding cross-site scripting in Web development, however, is to design your Web site so that it does not require client-side code at all. That way, if your users want to turn off the JavaScript interpreters in their browsers, they can do so without losing the ability to make use of your Web site. This does not protect against all forms of potential malicious input to your server, of course, and it does not actually limit the vulnerability of your Web site all by itself -- but it does give visitors to your Web site the option of protecting themselves.