[Repost] Understanding Malicious Content Mitigation for Web Developers

Source: www.cert.org

Understanding Malicious Content Mitigation for Web Developers

CERT Advisory CA-2000-02 describes a problem with malicious tags embedded in client HTTP requests, discusses the impact of malicious scripts, and offers ways to prevent the insertion of malicious tags.

This tech tip, written for web developers, describes more specifically the steps you can take to prevent attackers from using untrusted content to exploit your web site.

This document has the following sections:

Problem Summary

Web pages contain both text and HTML markup that is generated by the server and interpreted by the client browser. Servers that generate static pages have full control over how the client will interpret the pages sent by the server. However, servers that generate dynamic pages do not have complete control over how their output is interpreted by the client. The heart of the issue is that if untrusted content can be introduced into a dynamic page, neither the server nor the client has enough information to recognize that this has happened and take protective actions.
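To make the problem concrete, here is a minimal sketch of a dynamic page fragment built by string concatenation. The function name `buildGreeting` and the sample inputs are hypothetical and serve only as illustration.

```javascript
// Hypothetical page generator: buildGreeting is an illustrative name.
function buildGreeting(userName) {
  // Untrusted text is pasted straight into the markup.
  return "<p>Hello, " + userName + "!</p>";
}

// Ordinary input yields the fragment the developer expected.
const benign = buildGreeting("Alice");

// Input that carries markup changes the structure of the page: the output
// now contains a script element that the developer never wrote.
const hostile = buildGreeting("<script>alert(1)</script>");

console.log(benign);
console.log(hostile);
```

Nothing in the second result tells the browser which part of the markup came from the server's own template and which part came from the visitor.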

In HTML, to distinguish text from markup, some characters are treated specially. The grammar of HTML determines the significance of "special" characters -- different characters are special at different points in the document. For example, the less-than sign "<" typically indicates the beginning of an HTML tag. Tags can either affect the formatting of the page or introduce a program that the browser executes (e.g., the <SCRIPT> tag).

  • Server-side scripts
  • Other possibilities

    It is important to note that individual situations may warrant including additional characters in the list of special characters. Web developers must examine their applications and determine which characters can affect their web applications.

    Encoding Dynamic Output Elements

    Each character in the ISO-8859-1 specification can be encoded as a numeric character reference built from its character code. A complete description of the ISO-8859-1 specification can be found in the appendix of this document.

    The following example uses the copyright mark in an HTML document:

    &#169; 2000 Some Co., Inc.

    The copyright character has code 169. Writing it with the &# syntax (an ampersand, a number sign, the code, and a closing semicolon) lets the author insert an encoded character that the browser will interpret.

    In addition, many of the ISO-8859-1 characters have an entity name encoding. The copyright mark can also be written using this method:

    &copy; 2000 Some Co., Inc.

    Encoding untrusted data has benefits over filtering untrusted data, including the preservation of visual appearance in the browser. This is important when special characters are considered acceptable.

    Unfortunately, encoding all untrusted data can be resource intensive. Web developers must strike a balance between encoding and the alternative, data filtering.
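The encoding approach described in this section can be sketched as follows. The helper names `toNumericReference` and `encodeForHtml` are hypothetical, and the set of characters treated as special is an assumption borrowed from the filtering samples later in this document; a given application may need a different set.

```javascript
// Characters treated as special in this sketch. The set is an assumption and
// may need extending for a particular application.
const SPECIAL = /[<>"'%;()&+]/g;

// Build the numeric character reference for a single character,
// for example the copyright mark (code 169) becomes "&#169;".
function toNumericReference(ch) {
  return "&#" + ch.charCodeAt(0) + ";";
}

// Replace every special character in one pass, so the "&" and ";" that the
// encoding itself produces are never encoded a second time.
function encodeForHtml(text) {
  return String(text).replace(SPECIAL, toNumericReference);
}

console.log(toNumericReference("\u00A9"));
console.log(encodeForHtml("<b>Tom & Jerry</b>"));
```

Because the characters are encoded rather than removed, the browser still displays the text as the user typed it, but no longer treats any part of it as markup.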

    Filtering Dynamic Content

    Unfortunately, it is unclear whether there are any other characters or character combinations that can be used to expose other vulnerabilities. The recommended method is to select the set of characters that is known to be safe rather than excluding the set of characters that might be bad. For example, a form element that is expecting a person's age can be limited to the set of digits 0 through 9. There is no reason for this age element to accept any letters or other special characters. Using this positive approach of selecting the characters that are acceptable will help to reduce the ability to exploit other yet unknown vulnerabilities.
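For the age field mentioned above, the positive approach might look like the sketch below. The function names are hypothetical, and the three-digit limit in `isValidAge` is an assumption added for illustration.

```javascript
// Keep only the characters on the allow-list (the digits 0 through 9).
function keepDigits(input) {
  return String(input).replace(/[^0-9]/g, "");
}

// Alternatively, reject the field outright unless it consists only of digits.
// The one-to-three digit limit is an assumption for this example.
function isValidAge(input) {
  return /^[0-9]{1,3}$/.test(String(input));
}

console.log(keepDigits("42<script>"));
console.log(isValidAge("42"));
console.log(isValidAge("4 2"));
```

Neither function needs to know which characters are dangerous; anything that is not a digit is simply not accepted.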

    The filtering process can be done as part of the data input process, the data output process, or both. Filtering the data during the output process, just before it is rendered as part of the dynamic page, is recommended. Done correctly, this approach ensures that all dynamic content is filtered. Filtering on the input side is less effective because dynamic content can be entered into a web site's database(s) via methods other than HTTP. In this case, the web server may never see the data as part of the input process. Unless the filtering is implemented in all places where dynamic data is entered, the data elements may still remain tainted.
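A rough sketch of output-side filtering follows. The `rows` array is a hypothetical stand-in for database records, one of which is assumed to have arrived by a route other than HTTP; the allowed character set (letters, digits, and spaces) is also an assumption.

```javascript
// Allow-list filter applied at output time: letters, digits, and spaces only.
function filterForOutput(value) {
  return String(value).replace(/[^A-Za-z0-9 ]/g, "");
}

// Hypothetical rows, standing in for data that may have reached the database
// through a path other than the web server.
const rows = [
  { name: "Alice", note: "regular customer" },
  { name: "Bob<script>", note: "imported from a batch file" },
];

// Every dynamic value passes through the filter just before it is written
// into the page, regardless of how it was originally entered.
function renderRows(items) {
  return items
    .map((r) => "<li>" + filterForOutput(r.name) + ": " + filterForOutput(r.note) + "</li>")
    .join("\n");
}

console.log(renderRows(rows));
```

Placing the filter inside the rendering step means a new data-entry path cannot bypass it.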

    Examine Cookies

    One method to exploit this vulnerability involves inserting malicious content into a cookie. Web developers should carefully examine cookies that they accept and use the filtering techniques described above to verify that they are not storing malicious content.
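As one possible way to examine cookies, the sketch below parses a raw Cookie request header and keeps only the values that pass an allow-list check. All function names are hypothetical, and the permitted character set (letters, digits, hyphen, and underscore) is an assumption that a real application would adjust to its own cookie formats.

```javascript
// Split a Cookie request header such as "a=1; b=2" into name/value pairs.
function parseCookieHeader(header) {
  const cookies = {};
  for (const part of String(header).split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip fragments that have no "=" separator
    const name = part.slice(0, eq).trim();
    const value = part.slice(eq + 1).trim();
    if (name) cookies[name] = value;
  }
  return cookies;
}

// Allow-list check. The permitted set is an assumption for this example.
function isSafeCookieValue(value) {
  return /^[A-Za-z0-9_-]+$/.test(value);
}

// Keep only the cookies whose values pass the check.
function acceptCookies(header) {
  const accepted = {};
  for (const [name, value] of Object.entries(parseCookieHeader(header))) {
    if (isSafeCookieValue(value)) accepted[name] = value;
  }
  return accepted;
}

console.log(acceptCookies("session=abc123; theme=<script>alert(1)</script>"));
```

Here the `theme` cookie is dropped because its value contains characters outside the allowed set, while `session` is kept.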


    Sample Filtering Code

    C++ Example

    
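    // Lookup table indexed by byte value: 0xFF marks a character to remove.
    // As written, the table flags " # & ' ( ) ; < >
    BYTE IsBadChar[] = {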
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0xFF,0xFF,0x00,0x00,0xFF,0xFF,0xFF,0xFF,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0xFF,0xFF,0x00,0xFF,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00
    };
    
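    // Removes flagged characters from a NUL-terminated buffer in place and
    // returns the number of bytes kept. The result is not re-terminated, so
    // the caller must use the returned length. BYTE and DWORD are the Windows
    // typedefs from <windows.h>. cChLen is not consulted; the loop stops at NUL.
    DWORD FilterBuffer(BYTE * pString, DWORD cChLen){
        BYTE * pBad  = pString;   // read cursor
        BYTE * pGood = pString;   // write cursor
        DWORD i = 0;
        if (!pString) return 0;
        for (i = 0; pBad[i]; i++){
            // Copy the byte forward only if the table does not flag it.
            if (!IsBadChar[pBad[i]]) *pGood++ = pBad[i];
        }
        return (DWORD)(pGood - pString);
    }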
    

    JavaScript Example

    
    function RemoveBad(InStr){
        InStr = InStr.replace(/\</g,"");
        InStr = InStr.replace(/\>/g,"");
        InStr = InStr.replace(/\"/g,"");
        InStr = InStr.replace(/\'/g,"");
        InStr = InStr.replace(/\%/g,"");
        InStr = InStr.replace(/\;/g,"");
        InStr = InStr.replace(/\(/g,"");
        InStr = InStr.replace(/\)/g,"");
        InStr = InStr.replace(/\&/g,"");
        InStr = InStr.replace(/\+/g,"");
        return InStr;
    }
    

    Perl Example

    
    # The first function takes the negative approach.
    # Use a list of bad characters to filter the data.
    sub FilterNeg {
        my ( $fd ) = @_;
        $fd =~ s/[\<\>\"\'\%\;\)\(\&\+]//g;
        return( $fd ) ;
    }
    
    # The second function takes the positive approach.
    # Use a list of good characters to filter the data:
    # delete everything except letters, digits, and spaces.
    sub FilterPos {
        my ( $fd ) = @_;
        $fd =~ tr/A-Za-z0-9\ //dc;
        return( $fd ) ;
    }
    
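    # Test input; the markup at the end of the string is an assumed example.
    $Data = "This is a test string<script>";
    print FilterNeg( $Data ), "\n";
    print FilterPos( $Data ), "\n";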
