This tests the middleware by applying themes to a mostly static site.

First we'll set up the site, using URLMap:

    >>> from paste.urlmap import URLMap
    >>> from webob import Request, Response
    >>> app = URLMap()

A theme:

    >>> app['/theme.html'] = Response('''\
    ... <html>
    ...  <head>
    ...   <title>This is a theme title</title>
    ...   <link rel=Stylesheet type="text/css" href="style.css">
    ...   <style type="text/css">
    ...     @import "style2.css";
    ...   </style>
    ...  </head>
    ...  <body>
    ... 
    ...   <div id="header" class="title-bar">
    ...     <h1 id="title">This is the theme title</h1>
    ...     <div class="topnav"></div>
    ...   </div>
    ...   <div id="content-wrapper">
    ...     <a name="top"></a>
    ...     <div id="content">
    ...       This content will be replaced.
    ...     </div>
    ...     <a href="#top">Back to top</a>
    ...   </div>
    ... 
    ...   <div id="footer">
    ...     <span id="copyright">Copyright (C)</span> 2000 Some Corporation
    ...   </div>
    ... 
    ...  </body>
    ... </html>''')

The rule XML:

    >>> rule_xml = '''\
    ... <ruleset>
    ...   <match path="/blog" class="blog" />
    ...   <match path="exact:/about.html" class="breakout" />
    ...   <match request-header="X-No-Deliverate: boolean:true" abort="1" />
    ...   <match response-header="X-No-Deliverate: boolean:true" abort="1" />
    ...   <match environ="wsgi.url_scheme: https" class="via-https" />
    ...   <theme href="/theme.html" />
    ...   <match path="exact:/magic" class="magic" />
    ...   <rule class="magic">
    ...     <prepend content="elements:/html/head/meta"
    ...              theme="children:/html/head"/>
    ...   </rule>
    ...   <rule class="default">
    ...     <replace content="children:#footer" theme="children:#footer" nocontent="ignore" />
    ...     <replace content="children:body" theme="children:#content" nocontent="abort" />
    ...   </rule>
    ...   <rule class="breakout">
    ...     <replace content="children:#footer" theme="children:#footer" nocontent="ignore" />
    ...     <replace content="children:body" theme="children:#content-wrapper" nocontent="abort" />
    ...   </rule>
    ...   <rule class="blog">
    ...     <drop theme="#copyright" if-content="#cc" />
    ...     <drop theme="tag:#copyright" notheme="ignore" />
    ...     <drop content="#cc" nocontent="ignore" />
    ...     <replace content="children:#content" theme="children:#content" nocontent="abort" />
    ...   </rule>
    ... </ruleset>'''
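
To get a feel for how the ruleset is laid out, here is a small sketch that
walks a trimmed copy of it with the standard library's ElementTree. This is
only an illustration written for Python 3 (unlike the Python 2 doctests in
this file); it is not how Deliverance itself parses rules, and the names
SAMPLE_RULESET and summarize_ruleset are made up for the example.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the ruleset above, kept short for illustration.
SAMPLE_RULESET = '''\
<ruleset>
  <match path="/blog" class="blog" />
  <match path="exact:/about.html" class="breakout" />
  <theme href="/theme.html" />
  <rule class="default">
    <replace content="children:body" theme="children:#content" nocontent="abort" />
  </rule>
  <rule class="blog">
    <drop content="#cc" nocontent="ignore" />
    <replace content="children:#content" theme="children:#content" nocontent="abort" />
  </rule>
</ruleset>'''


def summarize_ruleset(xml_text):
    """Return (theme_href, match_classes, actions_by_rule_class)."""
    root = ET.fromstring(xml_text)
    theme = root.find('theme')
    theme_href = theme.get('href') if theme is not None else None
    # Classes that <match> elements can assign to a request.
    match_classes = [m.get('class') for m in root.findall('match') if m.get('class')]
    # For each <rule>, the action tags it contains, in document order.
    actions = {}
    for rule in root.findall('rule'):
        actions[rule.get('class')] = [child.tag for child in rule]
    return theme_href, match_classes, actions


theme_href, match_classes, actions = summarize_ruleset(SAMPLE_RULESET)
print(theme_href)
print(match_classes)
print(actions)
```

Each match element assigns a class to a request, and the rule with the
same class lists the actions to run against the theme.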

Rule files can be published and fetched with a subrequest:

    >>> app['/mytheme/rules.xml'] = Response(rule_xml, content_type="application/xml")

Rule files can also be read directly from the filesystem. Here's one:

    >>> import os, tempfile
    >>> rule_fd, rule_filename = tempfile.mkstemp()
    >>> os.close(rule_fd)  # mkstemp returns an open descriptor; close it so it is not leaked
    >>> f = open(rule_filename, 'w+')
    >>> f.write(rule_xml)
    >>> f.close()

Now let's set up some pages for Deliverance to work with:

    >>> app['/blog/index.html'] = Response('''\
    ... <html><head><title>A blog post</title>
    ... <link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed" />
    ... </head>
    ... <body>
    ... Some junk
    ... <div id="content">the blog post <b>with some style</b></div>
    ... some more junk
    ... <div id="footer">a footer that will be ignored</div>
    ... <div id="cc">Creative Commons License</div>
    ... </body></html>
    ... ''')
    >>> app['/about.html'] = Response('''\
    ... <html><title>About this site</title></html>
    ... <body>
    ... This is all about this site.
    ... <div id="footer">a footer that will be ignored</div>
    ... </body></html>
    ... ''')
    >>> app['/magic'] = Response('''\
    ... <html><head></head><body>A simple page</body></html>''')
    >>> app['/magic'].headers['x-no-deliverate'] = '1'
    >>> app['/magic2'] = Response('''\
    ... <html><head><meta http-equiv="x-no-deliverate" content="1" /></head><body>something</body></html>''')

We'll set up one DeliveranceMiddleware using the published rules, and
another using the rule file:

    >>> from deliverance.middleware import DeliveranceMiddleware, FileRuleGetter, SubrequestRuleGetter
    >>> from deliverance.log import PrintingLogger
    >>> import logging
    >>> deliv_filename = DeliveranceMiddleware(app, FileRuleGetter(rule_filename),
    ...                                     PrintingLogger,
    ...                                     log_factory_kw=dict(print_level=logging.WARNING))
    >>> deliv_url = DeliveranceMiddleware(app, SubrequestRuleGetter('http://localhost/mytheme/rules.xml'),
    ...                               PrintingLogger,
    ...                               log_factory_kw=dict(print_level=logging.WARNING))

Now let's look at some plain content and its deliverated equivalent. Here's
a helper function to make the content easy to compare:

    >>> def compare_request(path, deliv):
    ...     # work around WebOb bug fixed here:
    ...     # http://bitbucket.org/ianb/webob/changeset/b8671bd53cf4/
    ...     app[path].body = app[path].body
    ...
    ...     raw_res = Request.blank(path).get_response(app)
    ...     result = 'Original content:\n' + raw_res.body.strip()
    ...     themed_res = Request.blank(path).get_response(deliv)
    ...     result += '\nThemed content:\n' + themed_res.body.strip()
    ...     return result

First we'll look at the blog, which is fairly simple:

    >>> print compare_request('/blog/index.html', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    <html><head><title>A blog post</title>
    <link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed" />
    </head>
    <body>
    Some junk
    <div id="content">the blog post <b>with some style</b></div>
    some more junk
    <div id="footer">a footer that will be ignored</div>
    <div id="cc">Creative Commons License</div>
    </body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed"><title>A blog post</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top"></a>
        <div id="content">the blog post <b>with some style</b></div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">
         2000 Some Corporation
      </div>
    <BLANKLINE>
     </body></html>

The result should be the same whether the rules come from the file or from the subrequest:

    >>> first = compare_request('/blog/index.html', deliv_filename)
    >>> second = compare_request('/blog/index.html', deliv_url)
    >>> first == second
    True
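
If two renderings like these ever differ, a unified diff is the quickest
way to see where; the +REPORT_UDIFF option used in this file asks doctest for
the same kind of report on failure. A minimal sketch with difflib, written
for Python 3; the udiff helper is a name invented for this example:

```python
import difflib


def udiff(first, second):
    """Return a unified diff of two multi-line strings ('' if identical)."""
    return '\n'.join(difflib.unified_diff(
        first.splitlines(), second.splitlines(),
        fromfile='first', tofile='second', lineterm=''))


same = udiff('a\nb', 'a\nb')
changed = udiff('a\nb', 'a\nc')
print(repr(same))
print(changed)
```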

Now the about page, with its breakout style:

    >>> print compare_request('/about.html', deliv_url) # doctest: +REPORT_UDIFF
    Original content:
    <html><title>About this site</title></html>
    <body>
    This is all about this site.
    <div id="footer">a footer that will be ignored</div>
    </body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><title>About this site</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
    This is all about this site.
    </div>
    <BLANKLINE>
      <div id="footer">a footer that will be ignored</div>
    <BLANKLINE>
     </body></html>

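Now the magic responses, which shouldn't get themed at all. The first sets
the X-No-Deliverate response header; the second carries the equivalent
meta http-equiv tag in its markup: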

    >>> print compare_request('/magic', deliv_filename)
    Original content:
    <html><head></head><body>A simple page</body></html>
    Themed content:
    <html><head></head><body>A simple page</body></html>
    >>> print compare_request('/magic2', deliv_url)
    Original content:
    <html><head><meta http-equiv="x-no-deliverate" content="1" /></head><body>something</body></html>
    Themed content:
    <html><head><meta http-equiv="x-no-deliverate" content="1" /></head><body>something</body></html>
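
The opt-out idea can be sketched as a tiny WSGI middleware. This is not
Deliverance's implementation: SkipHeaderMiddleware, TRUE_VALUES, make_app and
call are hypothetical names, the set of "true" spellings is an assumption,
and only the response header is checked (not the request header or a meta
tag). It is written for Python 3 and uses only the standard library.

```python
from wsgiref.util import setup_testing_defaults

TRUE_VALUES = ('1', 'true', 'yes', 'on')  # assumed spellings of "true"


class SkipHeaderMiddleware(object):
    """Hypothetical middleware: transform the body unless a header opts out."""

    def __init__(self, app, transform, header='X-No-Deliverate'):
        self.app = app
        self.transform = transform
        self.header = header.lower()

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold on to the status and headers until the body has been inspected.
            captured['status'] = status
            captured['headers'] = headers
            return lambda data: None  # legacy write() callable, unused here

        app_iter = self.app(environ, capture)
        try:
            body = b''.join(app_iter)
        finally:
            if hasattr(app_iter, 'close'):
                app_iter.close()

        headers = captured['headers']
        opted_out = any(
            name.lower() == self.header and value.strip().lower() in TRUE_VALUES
            for name, value in headers)
        if not opted_out:
            body = self.transform(body)
            # The body length may have changed, so recompute Content-Length.
            headers = [(n, v) for n, v in headers if n.lower() != 'content-length']
            headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]


def make_app(body, extra_headers=()):
    """Build a trivial WSGI app that always returns the given body."""
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')] + list(extra_headers))
        return [body]
    return app


def call(app):
    """Run one request through a WSGI app and return the response body."""
    environ = {}
    setup_testing_defaults(environ)
    return b''.join(app(environ, lambda status, headers, exc_info=None: None))


themed_body = call(SkipHeaderMiddleware(make_app(b'hello'), bytes.upper))
untouched_body = call(SkipHeaderMiddleware(
    make_app(b'hello', [('X-No-Deliverate', '1')]), bytes.upper))
print(themed_body, untouched_body)
```

Here bytes.upper stands in for theming: the plain response is transformed,
while the one carrying the header passes through byte for byte.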

Deliverance should not blow up if the content response is empty:

    >>> app['/empty'] = Response('')
    >>> print compare_request('/empty', deliv_filename)
    Original content:
    <BLANKLINE>
    Themed content:
    <BLANKLINE>

Let's also make sure Deliverance correctly preserves HTML entities (a named
entity may come back as the equivalent numeric character reference):

    >>> app['/html'] = Response("One &hellip; two")
    >>> print compare_request('/html', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    One &hellip; two
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><title>This is a theme title</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top"></a>
        <div id="content"><p>One &#8230; two</p></div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">
        <span id="copyright">Copyright (C)</span> 2000 Some Corporation
      </div>
    <BLANKLINE>
     </body></html>
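
The named entity in the content and the numeric reference in the themed
output denote the same character, which can be checked with the standard
library (a Python 3 snippet, separate from the Python 2 doctests here):

```python
import html

# &hellip; and &#8230; both decode to U+2026 HORIZONTAL ELLIPSIS.
named = html.unescape('One &hellip; two')
numeric = html.unescape('One &#8230; two')
print(named == numeric)
```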

When you are using a rule file from the filesystem, it will not be re-read
on every request by default. So if we change the rules on the filesystem
and make a request through the existing DeliveranceMiddleware, there will be
no change.
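
The caching behaviour can be pictured with a small sketch. CachedFileReader
and write_text are hypothetical names, not part of Deliverance; the
always_reload flag borrows the name of the FileRuleGetter option but its
implementation here is an assumption. Written for Python 3, standard
library only.

```python
import os
import tempfile


class CachedFileReader(object):
    """Hypothetical reader: loads the file once unless always_reload is set."""

    def __init__(self, filename, always_reload=False):
        self.filename = filename
        self.always_reload = always_reload
        self._cached = None

    def get(self):
        # Read from disk on first use, or on every call when always_reload is on.
        if self.always_reload or self._cached is None:
            with open(self.filename) as f:
                self._cached = f.read()
        return self._cached


def write_text(filename, text):
    with open(filename, 'w') as f:
        f.write(text)


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_text(path, 'old rules')
    cached = CachedFileReader(path)
    reloading = CachedFileReader(path, always_reload=True)
    first_reads = (cached.get(), reloading.get())
    write_text(path, 'new rules')
    second_reads = (cached.get(), reloading.get())
finally:
    os.remove(path)

print(first_reads)
print(second_reads)
```

After the file changes, the cached reader still returns the old text while
the reloading one picks up the new text.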

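Here we'll remove the theming rules for the blog from the file and request
the blog page again. The rules loaded earlier still apply, so the themed
output is the same as before: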

    >>> new_rule_xml = '''\
    ... <ruleset>
    ...   <match path="exact:/about.html" class="breakout" />
    ...   <match request-header="X-No-Deliverate: boolean:true" abort="1" />
    ...   <match response-header="X-No-Deliverate: boolean:true" abort="1" />
    ...   <match environ="wsgi.url_scheme: https" class="via-https" />
    ...   <theme href="/theme.html" />
    ...   <rule class="default">
    ...     <replace content="children:#footer" theme="children:#footer" nocontent="ignore" />
    ...     <replace content="children:body" theme="children:#content" nocontent="abort" />
    ...   </rule>
    ...   <rule class="breakout">
    ...     <replace content="children:#footer" theme="children:#footer" nocontent="ignore" />
    ...     <replace content="children:body" theme="children:#content-wrapper" nocontent="abort" />
    ...   </rule>
    ... </ruleset>'''
    >>> f = open(rule_filename, 'w+')
    >>> f.write(new_rule_xml)
    >>> f.close()

    >>> print compare_request('/blog/index.html', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    <html><head><title>A blog post</title>
    <link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed" />
    </head>
    <body>
    Some junk
    <div id="content">the blog post <b>with some style</b></div>
    some more junk
    <div id="footer">a footer that will be ignored</div>
    <div id="cc">Creative Commons License</div>
    </body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed"><title>A blog post</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top"></a>
        <div id="content">the blog post <b>with some style</b></div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">
         2000 Some Corporation
      </div>
    <BLANKLINE>
     </body></html>

However, if we set always_reload=True in the FileRuleGetter, the rules
will be re-read from the file on every request:

    >>> f = open(rule_filename, 'w+')
    >>> f.write(rule_xml)
    >>> f.close()
    >>> deliv_filename = DeliveranceMiddleware(app,
    ...                                     FileRuleGetter(rule_filename, always_reload=True),
    ...                                     PrintingLogger,
    ...                                     log_factory_kw=dict(print_level=logging.WARNING))

    >>> print compare_request('/blog/index.html', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    <html><head><title>A blog post</title>
    <link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed" />
    </head>
    <body>
    Some junk
    <div id="content">the blog post <b>with some style</b></div>
    some more junk
    <div id="footer">a footer that will be ignored</div>
    <div id="cc">Creative Commons License</div>
    </body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed"><title>A blog post</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top"></a>
        <div id="content">the blog post <b>with some style</b></div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">
         2000 Some Corporation
      </div>
    <BLANKLINE>
     </body></html>

Now we write the rules without the blog class to the file again. This time
the change takes effect on the next request, and the blog page falls back
to the default rule:

    >>> f = open(rule_filename, 'w+')
    >>> f.write(new_rule_xml)
    >>> f.close()

    >>> print compare_request('/blog/index.html', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    <html><head><title>A blog post</title>
    <link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed" />
    </head>
    <body>
    Some junk
    <div id="content">the blog post <b>with some style</b></div>
    some more junk
    <div id="footer">a footer that will be ignored</div>
    <div id="cc">Creative Commons License</div>
    </body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><head><link href="rss.xml" rel="alternate" type="application/rss+xml" title="RSS Feed"><title>A blog post</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css"><style type="text/css">
        @import "http://localhost/style2.css";
      </style></head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top"></a>
        <div id="content">
    Some junk
    <div id="content">the blog post <b>with some style</b></div>
    some more junk
    <div id="cc">Creative Commons License</div>
    </div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">a footer that will be ignored</div>
    <BLANKLINE>
     </body></html>

The content's DOCTYPE should be respected. So if the content's DOCTYPE is XHTML,
the merged output should preserve that DOCTYPE, and self-closing tags should be
preserved rather than being rewritten as unclosed tags. Let's see:

    >>> app['/magic'].body = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    ... "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    ... <html><head>
    ... <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    ... </head><body> 
    ... <img src="foo.png" /> 
    ... A simple page</body></html>"""
    >>> del app['/magic'].headers['x-no-deliverate']

Now let's see what happens when Deliverance is done with it:

    >>> print compare_request('/magic', deliv_filename) # doctest: +REPORT_UDIFF
    Original content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html><head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    </head><body> 
    <img src="foo.png" /> 
    A simple page</body></html>
    Themed content:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml"><head><title>This is a theme title</title><link rel="Stylesheet" type="text/css" href="http://localhost/style.css" /><style type="text/css">
        @import "http://localhost/style2.css";
      </style>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      </head><body>
    <BLANKLINE>
      <div id="header" class="title-bar">
        <h1 id="title">This is the theme title</h1>
        <div class="topnav"></div>
      </div>
      <div id="content-wrapper">
        <a name="top" id="top"></a>
        <div id="content"> 
    <img src="foo.png" /> 
    A simple page</div>
        <a href="#top">Back to top</a>
      </div>
    <BLANKLINE>
      <div id="footer">
        <span id="copyright">Copyright (C)</span> 2000 Some Corporation
      </div>
    <BLANKLINE>
     </body></html>

It worked. There's a new id="top" attribute on the <a name="top"> tag;
it's required for XHTML: http://www.w3.org/TR/xhtml1/#h-4.10

In the above example note also that meta Content-Type tags are dropped
even though we have a rule to preserve them -- I think that's related
to the following lxml.html behavior:

    >>> from lxml.html import tostring, fromstring
    >>> x = """<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></meta></head></html>"""
    >>> tostring(fromstring(x))
    '<html><head></head></html>'
    >>> tostring(fromstring(x), include_meta_content_type=True)
    '<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head></html>'
