
Better HTML parsing and validation with HtmlAgilityPack

Let's face it; sometimes the Microsoft.VisualStudio.TestTools.WebTesting.HtmlDocument class just doesn't cut it when you're writing custom extraction and validation code.  HtmlDocument was originally designed as an internal class to very efficiently parse URLs for dependent requests (such as images) out of HTML response bodies.  Before VS 2005 RTM, we made HtmlDocument part of the public WebTestFramework API, but scheduling and resource constraints prevented us from adding more general purpose DOM features like InnerHtml, InnerText, and GetElementById.  You could always parse the HTML string yourself, but fortunately there's a better option: HtmlAgilityPack.

HtmlAgilityPack is an open source project on CodePlex.  It provides standard DOM APIs and XPath navigation -- even when the HTML is not well-formed!

Here's a sample web test that uses the HtmlAgilityPack.HtmlDocument instead of the one in WebTestFramework.  It simply validates that Microsoft's home page lists Windows as the first item in the navigation sidebar.  Download HtmlAgilityPack and add a reference to it from your test project to try out this coded web test.

using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.VisualStudio.TestTools.WebTesting;
using HtmlAgilityPack;

public class WebTest1Coded : WebTest
{
    public override IEnumerator<WebTestRequest> GetRequestEnumerator()
    {
        WebTestRequest request1 = new WebTestRequest("http://www.microsoft.com/");
        request1.ValidateResponse += new EventHandler<ValidationEventArgs>(request1_ValidateResponse);
        yield return request1;
    }

    void request1_ValidateResponse(object sender, ValidationEventArgs e)
    {
        //load the response body string as an HtmlAgilityPack.HtmlDocument
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(e.Response.BodyString);

        //locate the "Nav" element
        HtmlNode navNode = doc.GetElementbyId("Nav");

        //pick the first <li> element (null if the "Nav" element was not found)
        HtmlNode firstNavItemNode = (navNode == null) ? null : navNode.SelectSingleNode(".//li");

        //validate the first list item in the Nav element says "Windows"
        e.IsValid = firstNavItemNode != null && firstNavItemNode.InnerText == "Windows";
    }
}



Updated: Fixed XPath query thanks to Oleg's comment.  Also fixed indentation of the code.

Posted by JoshCh | 6 Comments

Check and modify the status of extraction or validation rules

The following came up on our internal discussion list today.  A user wanted to run an extraction rule and execute some different requests based on whether the rule succeeded or not.  The problem is that a failed extraction rule normally causes the web test to fail.  Fortunately, there's an easy way to check the success status of the rule, use that value later, and prevent the rule from failing the web test.

 

Let's say you have a coded web test like the following:

 

        public override IEnumerator<WebTestRequest> GetRequestEnumerator()
        {
            WebTestRequest request1 = new WebTestRequest("http://vsnc/");
            request1.RecordedResponseUrl = "http://vsnc/";
            ExtractText extractionRule1 = new ExtractText();
            extractionRule1.StartsWith = "Logged in as ";
            extractionRule1.EndsWith = ".";
            extractionRule1.IgnoreCase = false;
            extractionRule1.UseRegularExpression = false;
            extractionRule1.Required = true;
            extractionRule1.Index = 0;
            extractionRule1.HtmlDecode = true;
            extractionRule1.ContextParameterName = "Name";
            request1.ExtractValues += new EventHandler<ExtractionEventArgs>(extractionRule1.Extract);
            yield return request1;
        }

 

Instead of hooking up the extraction rule directly to the ExtractValues event, you can prevent the web test from failing by using a custom event handler.  Your event handler can check and even modify values on the ExtractionEventArgs as shown below:

 

public override IEnumerator<WebTestRequest> GetRequestEnumerator()
{
    WebTestRequest request1 = new WebTestRequest("http://vsnc/");
    request1.RecordedResponseUrl = "http://vsnc/";
    request1.ExtractValues += new EventHandler<ExtractionEventArgs>(request1_ExtractValues);
    yield return request1;

    if ((bool)this.Context["LogInNameFound"] == true)
    {
        //do something, issue different requests, etc.
    }
}

 

void request1_ExtractValues(object sender, ExtractionEventArgs e)
{
    ExtractText extractionRule1 = new ExtractText();
    extractionRule1.StartsWith = "Logged in as ";
    extractionRule1.EndsWith = ".";
    extractionRule1.IgnoreCase = false;
    extractionRule1.UseRegularExpression = false;
    extractionRule1.Required = true;
    extractionRule1.Index = 0;
    extractionRule1.HtmlDecode = true;
    extractionRule1.ContextParameterName = "Name";

    //call the extraction rule directly
    extractionRule1.Extract(sender, e);

    //here's where I want to check or modify the success status of the rule
    if (e.Success)
    {
        //set a context parameter for use later in the web test
        this.Context["LogInNameFound"] = true;
    }
    else
    {
        //set a context parameter to indicate this rule failed
        this.Context["LogInNameFound"] = false;

        //force the rule to pass
        e.Success = true;
    }
}

 

As you can see, inserting your own event handler can give you more control over the execution of extraction and validation rules.

Posted by JoshCh | 3 Comments

Can I call a web test from a web test?

I've seen this question come up several times recently, so I'm going to try to provide the full answer here.  Let me start by saying that we have some significant changes in the pipeline that will make calling a web test from another web test a fully supported feature in a future release.  Until then, I do not recommend it due to the gotchas listed below.

  1. The web test engine will not load data sources in the called web test.  A possible workaround is to use custom databinding such as reading values directly from a database using ADO.NET.
  2. The WebTestContexts in the caller and callee web tests are not unified.  This can make it difficult to share values between the tests such as the current username or session ID.  This can also make it very difficult to use extraction rules since the rules will extract values into the caller's WebTestContext, but the callee will try to read the values out of its own WebTestContext.
  3. WebTestPlugins and PreWebTest/PostWebTest event handlers will not be called.
  4. WebTestRequestPlugins will not be called.  Note: WebTestRequest-level events such as PreRequest, ValidateResponse, ExtractValues, and PostRequest will still work.
  5. CreateTransaction() and EndTransaction() will have no effect.
  6. MoveDataTableCursor() will have no effect.

That's quite a list of gotchas, but calling a web test from another web test can still be done if you're not depending on databinding, transactions, the context, or plugins.

If you still want to call a web test from a web test, the code should look like this:

public override IEnumerator<WebTestRequest> GetRequestEnumerator()
{
    WebTestRequest request1 = new WebTestRequest("http://localhost/");
    yield return request1;

    WebTest2Coded webTest2Coded = new WebTest2Coded();
    IEnumerator<WebTestRequest> webTest2Enumerator = webTest2Coded.GetRequestEnumerator();
    while (webTest2Enumerator.MoveNext())
    {
        yield return webTest2Enumerator.Current;
    }

    WebTestRequest request2 = new WebTestRequest("http://localhost/");
    yield return request2;
}
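
The enumerator hand-off above is ordinary C# iterator composition, and it can be easier to see without the web test framework in the way. The following is only a sketch: strings stand in for WebTestRequest objects, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Framework-free illustration of iterator delegation. Strings stand in
// for WebTestRequest objects; none of these names come from the post.
public static class IteratorDelegation
{
    public static IEnumerable<string> Inner()
    {
        yield return "inner-1";
        yield return "inner-2";
    }

    public static IEnumerable<string> Outer()
    {
        yield return "outer-first";

        // Re-yield every item of the inner sequence; foreach also
        // disposes the inner enumerator when the loop ends.
        foreach (string item in Inner())
        {
            yield return item;
        }

        yield return "outer-last";
    }
}
```

Enumerating Outer() produces outer-first, inner-1, inner-2, outer-last, in that order.  The sketch can use foreach because Inner() returns an IEnumerable<string>.  GetRequestEnumerator() returns an IEnumerator<WebTestRequest> instead, which foreach cannot consume directly, so the web test version needs the explicit MoveNext() loop.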

Again, I do not recommend calling web tests from other web tests in VS 2005.  It would be best to wait for the next release when it will be a fully supported feature.

Posted by JoshCh | 3 Comments

So you want to replay an IIS web server log?

A few months ago, a group in Microsoft wanted to be able to play back a large IIS log as a Visual Studio web test.  They started off with a converter that converted the IIS log into a gigantic coded web test.  The 118MB .cs file that resulted was a bit ridiculous and didn't perform very well at design time or run time.

I took a different approach by reading the IIS log from within the web test.  It depends on the handy LogReader 2.2 download to handle all the log parsing and keep the code short and simple.

Here's a sample WebTest that plays back an IIS log:

public class IISLogCodedWebTest : WebTest
{
    public IISLogCodedWebTest()
    {
        this.PreAuthenticate = true;
    }

    public override IEnumerator<WebTestRequest> GetRequestEnumerator()
    {
        IISLogReader reader = new IISLogReader(@"d:\download\ex060209.log");
        foreach (WebTestRequest request in reader.GetRequests())
        {
            yield return request;
        }
    }
}

The code for the IISLogReader class used above is below:

using System;
using System.Collections.Generic;
using System.Text;
using MSUtil;
using LogQuery = MSUtil.LogQueryClassClass;
using IISLogInputFormat = MSUtil.COMIISW3CInputContextClassClass;
using LogRecordSet = MSUtil.ILogRecordset;
using Microsoft.VisualStudio.TestTools.WebTesting;

namespace IISLogToWebTest
{
    public class IISLogReader
    {
        private string _iisLogPath;

        public IISLogReader(string iisLogPath)
        {
            _iisLogPath = iisLogPath;
        }

        public IEnumerable<WebTestRequest> GetRequests()
        {
            LogQuery logQuery = new LogQuery();
            IISLogInputFormat iisInputFormat = new IISLogInputFormat();

            string query = @"SELECT s-ip, s-port, cs-method, cs-uri-stem, cs-uri-query FROM " + _iisLogPath;

            LogRecordSet recordSet = logQuery.Execute(query, iisInputFormat);
            while (!recordSet.atEnd())
            {
                ILogRecord record = recordSet.getRecord();
                if (record.getValueEx("cs-method").ToString() == "GET")
                {
                    string server = record.getValueEx("s-ip").ToString();
                    string path = record.getValueEx("cs-uri-stem").ToString();
                    string querystring = record.getValueEx("cs-uri-query").ToString();

                    StringBuilder urlBuilder = new StringBuilder();
                    urlBuilder.Append("http://");
                    urlBuilder.Append(server);
                    urlBuilder.Append(path);
                    if (!String.IsNullOrEmpty(querystring))
                    {
                        urlBuilder.Append("?");
                        urlBuilder.Append(querystring);
                    }

                    WebTestRequest request = new WebTestRequest(urlBuilder.ToString());
                    yield return request;
                }

                recordSet.moveNext();
            }
 
            recordSet.close();
        }
    }
}
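
If the log parsing COM component is not available on the test machine, the same URL-building logic can be approximated by reading the log text directly.  This is a rough sketch and not the approach used above: the W3CLogUrls class is hypothetical, and it assumes a W3C extended log with space-separated values, a "#Fields:" directive naming the columns, and "-" for empty values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not from the post: builds replay URLs from W3C
// extended log lines without the COM log parser. Assumes space-separated
// values, a "#Fields:" directive naming the columns, and "-" for empty values.
public static class W3CLogUrls
{
    public static List<string> GetUrls(IEnumerable<string> lines)
    {
        List<string> urls = new List<string>();
        string[] fields = new string[0];

        foreach (string line in lines)
        {
            if (line.StartsWith("#Fields:", StringComparison.Ordinal))
            {
                // Remember the column order declared by the log file.
                fields = line.Substring("#Fields:".Length).Trim().Split(' ');
                continue;
            }
            if (line.Length == 0 || line.StartsWith("#", StringComparison.Ordinal))
            {
                continue; // skip other directives and blank lines
            }

            string[] values = line.Split(' ');
            if (values.Length != fields.Length)
            {
                continue; // skip lines that do not match the declared columns
            }

            Dictionary<string, string> record = new Dictionary<string, string>();
            for (int i = 0; i < fields.Length; i++)
            {
                record[fields[i]] = values[i];
            }

            // Only GET requests are replayed, as in the reader above.
            string method, server, path, query;
            if (!record.TryGetValue("cs-method", out method) || method != "GET") continue;
            if (!record.TryGetValue("s-ip", out server)) continue;
            if (!record.TryGetValue("cs-uri-stem", out path)) continue;

            string url = "http://" + server + path;
            if (record.TryGetValue("cs-uri-query", out query) && query != "-")
            {
                url += "?" + query;
            }
            urls.Add(url);
        }
        return urls;
    }
}
```

Each returned string could then be wrapped in a WebTestRequest inside GetRequestEnumerator(), the same way the requests from IISLogReader are.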

Let me know if you find this useful or if you have any problems.

Posted by JoshCh | 6 Comments

Why can't I generate more load?

A common question we get goes something like this: "I'm running a 100 user load test and getting X RPS (requests per second).  When I add 500 more users, I'm still getting X RPS.  What's wrong?"

Here's a list of some things to check when you're not able to generate the load you expected:

  1. Is your web server CPU/Memory/Network maxed out?
  2. Is your load generating machine’s (VS machine or agents) CPU/Memory/Network maxed out?
  3. Is your database server’s (if one exists) CPU/Memory/Disk/Network maxed out?
  4. Do you have ThinkTime turned on in your load test?  This will limit the rate each “user” can submit requests.  For example, 5 seconds of ThinkTime per request will yield a maximum of 0.2 RPS per “user”.  Turn ThinkTime off for maximum load generation or use ThinkTime and increase the number of users for more realistic load generation.  The latter will generally require more memory.
  5. Make sure the Proxy properties on your web tests are not set to “default”.  This enables automatic proxy server detection which is VERY slow and will greatly reduce your maximum throughput.
  6. Don’t forget a load testing tool is designed to find bottlenecks in your application.  If you have pages with high response times due to a database or CPU bottleneck it will limit the number of requests each virtual user can issue per second.  Start out with a small amount of load and make sure response times stay reasonable as you ramp up the load.  Twenty users with ThinkTime shouldn’t see greater than 10 second response times, for example.  You can use the Response Time Goal property to set the maximum expected response time on each request.
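
To put numbers on items 4 and 6, here is a back-of-the-envelope model.  It is only a sketch: the LoadEstimate class is made up for illustration and is not part of Visual Studio, and it assumes each virtual user issues one request at a time, waits for the response, and then pauses for the think time.

```csharp
using System;

// Back-of-the-envelope model, not part of the Visual Studio API.
// Assumes each virtual user sends one request, waits for the response,
// then pauses for the think time before sending the next request.
public static class LoadEstimate
{
    // Upper bound on requests per second for the whole load test.
    public static double MaxRequestsPerSecond(int users, double thinkTimeSeconds, double responseTimeSeconds)
    {
        double secondsPerRequest = thinkTimeSeconds + responseTimeSeconds;
        if (secondsPerRequest <= 0)
        {
            throw new ArgumentException("Think time plus response time must be greater than zero.");
        }
        return users / secondsPerRequest;
    }
}
```

With 5 seconds of think time and a negligible response time, one user tops out at 0.2 RPS and 100 users at 20 RPS.  If response times climb to 25 seconds as users are added, 600 users also top out at 600 / 30 = 20 RPS, which is one way the "more users, same RPS" symptom can appear.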

The problem generally comes down to something on this list.  Do you have any tips that need to be added to this list?

Posted by JoshCh | 2 Comments

More load test bloggers

Ed Glas and Sean Lumley are now also blogging about the web/load testing tools in Visual Studio Team System.  Ed has several good posts up already.

Posted by JoshCh | 1 Comments

Bill Barnett, load test blogger

Bill Barnett, another member of the "Ocracoke" team (VS Team System's Web and Load Testing tools), is now blogging.  Take a look at the Advanced Load Testing Features article he posted.
Posted by JoshCh | 0 Comments

Web Test Authoring and Debugging Techniques

My whitepaper titled Web Test Authoring and Debugging Techniques is now live on MSDN!  It covers some best practices for creating web tests as well as a lot of things to look for when things don't go the way you expected.

Please let me know what you think or if you have any questions.

Posted by JoshCh | 1 Comments

Custom ExtractionRule to extract form fields by index

*This is the third post in a series about web test extensibility points.  The first post was about extending web tests using custom IHttpBody classes and the second post was about a custom ValidationRule to catch redirects to error pages.*

The ExtractHiddenFields rule that is present in most web tests works by extracting every hidden field on a page into the web test context.  The naming convention used for these hidden fields is $HIDDENx.y where x is the context parameter name on ExtractHiddenFields (usually '1') and y is the hidden field name.  This convention works well for most web pages, but it can't distinguish between multiple hidden fields with the same name, like a page with multiple forms might have.  Fortunately, this scenario can be supported with the following custom extraction rule and some minor modifications to the web test.

using System;
using System.Collections;
using System.ComponentModel;
using Microsoft.VisualStudio.TestTools.WebTesting;

namespace OcracokeSamples {
    public class ExtractFormFieldWithIndex : ExtractionRule {
        private string _name;
        private int _index = 1;

        [Description("The name of the form field to extract.")]
        public string Name {
            get { return _name; }
            set { _name = value; }
        }

        [Description("The index of the form field.  For example, specifying '2' would " +
            "extract the value of the second form field with the given name.")]
        public int Index {
            get { return _index; }
            set { _index = value; }
        }

        public override string RuleName {
            get { return "Extract Form Field with Index"; }
        }

        public override string RuleDescription {
            get {
                return "Extracts a form field from a page based on the order it appears.  " +
                "Useful when a page contains multiple forms with fields of the same name.";
            }
        }

        public override void Extract(object sender, ExtractionEventArgs e) {
            //if the response is not HTML, display an error
            if (!e.Response.IsHtml) {
                e.Success = false;
                e.Message = "The response did not contain HTML.";
                return;
            }

            string formFieldValue = null;
            int currentIndex = 0;

            //examine each input tag
            foreach (HtmlTag tag in e.Response.HtmlDocument.GetFilteredHtmlTags("input")) {
                if (String.Equals(tag.GetAttributeValueAsString("name"), _name, StringComparison.OrdinalIgnoreCase)) {
                    currentIndex++;

                    if (currentIndex == _index) {
                        formFieldValue = tag.GetAttributeValueAsString("value");
                       
                        //if the form field was found, but had no value property, set the value to empty string
                        if (formFieldValue == null) {
                            formFieldValue = String.Empty;
                        }

                        break;
                    }
                }
            }

            if (formFieldValue != null) {
                e.WebTest.Context.Add(this.ContextParameterName, formFieldValue);
                e.Success = true;
                return;
            } else {
                e.Success = false;
                e.Message = String.Format("Form field named '{0}' with index '{1}' was not found.", _name, _index);
            }
        }
    }
}

If you read my earlier post about creating a custom validation rule, you'll recognize that creating a custom extraction rule is nearly identical.  You simply create a class that derives from ExtractionRule and implement your extraction logic in the Extract method.  The ExtractionEventArgs parameter to the Extract method provides access to the WebTestResponse containing all the response data received from the server.  Setting e.Success reports whether the rule passed or failed and e.Message allows you to provide an error message that will be displayed in the web test result viewer or the load test monitor.

To use a custom extraction rule in a web test, you must first let your test project know it exists.  If the rule class is included in the test project, just build the project and the rule will show up in the Add Extraction Rule dialog.  If, however, the rule is part of a separate class library project and potentially shared by multiple test projects, just build that class library project and add a reference to it in your test project(s).  Now when you right-click a request and select Add Extraction Rule, you should see your rule in the Add Extraction Rule dialog shown below.

Add Extraction Rule dialog

To use this extraction rule, you enter a form field name and specify an index to determine which occurrence of that form field you want to extract (1 for the first, 2 for the second, etc.).  Once you specify a context parameter name for the form field you're extracting and add this extraction rule to a request, you'll just need to update the parameter that corresponds to this form field.  Instead of $HIDDENx.y, bind the parameter's value to this new context parameter.
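
The index matching is the only subtle part of the rule, so here it is in isolation.  This is a stand-alone sketch of the same logic: the FormFieldLookup class and the list of name/value pairs are hypothetical stand-ins for the input tags the rule reads from the response.

```csharp
using System;
using System.Collections.Generic;

// Stand-alone illustration of the rule's matching logic. The field list
// is a hypothetical stand-in for the <input> tags parsed from a response.
public static class FormFieldLookup
{
    // Returns the value of the index-th (1-based) field with the given name,
    // or null when there are fewer than index matching fields.
    public static string FindByIndex(IList<KeyValuePair<string, string>> fields, string name, int index)
    {
        int currentIndex = 0;
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (String.Equals(field.Key, name, StringComparison.OrdinalIgnoreCase))
            {
                currentIndex++;
                if (currentIndex == index)
                {
                    // A matching field without a value counts as empty, not missing.
                    return field.Value ?? String.Empty;
                }
            }
        }
        return null;
    }
}
```

For a page whose two forms each carry a __VIEWSTATE field, index 1 returns the first form's value and index 2 the second's, while index 3 returns null, which is the case where the rule sets e.Success to false.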

That's it.  Any questions about implementing custom extraction rules or using this sample rule?

Posted by JoshCh | 4 Comments

VSTS Dev & Test Tools MSDN Public Chat - Wed 10/19 @ 1pm EDT

Team System MSDN Public Chat

Visual Studio Team Edition for Software Developers & Visual Studio Team Edition for Software Testers
When: Wednesday, November 16th @ 10am PST
What: Join us to discuss the Profiler, Test Tools (Unit, Generic, Manual), Web & Load Testing, and Code Analysis (FxCop & PREFast). We have questions for you, will answer questions from you, and will chat about the exciting new technology.
Where: http://msdn.microsoft.com/chats
Posted by JoshCh | 0 Comments

A custom ValidationRule to catch redirects to error pages

*This is the second post in a series about web test extensibility points.  The first post was about extending web tests using custom IHttpBody classes.*

It is a common practice for a web application to trap errors and redirect the user to a "We're sorry, an error has occurred..." page.  Unfortunately, these error pages often return a "200 OK" status code instead of an error code in the 400 or 500 range.  This behavior is definitely something to watch out for when creating and running web load tests since it could allow web application errors to go unnoticed.

Here is a sample validation rule called ValidateResponseUrl that checks response URLs to catch redirects to error pages:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Text;
using Microsoft.VisualStudio.TestTools.WebTesting;

namespace OcracokeSamples {
    public class ValidateResponseUrl : ValidationRule {
        private string _urlStringToFind = String.Empty;
        private bool _failIfFound = true;

        [Description("If true, validation fails if the specified string is found in the response URL.  If false, validation fails if the specified string is not found in the response URL.")]
        public bool FailIfFound {
            get { return _failIfFound; }
            set { _failIfFound = value; }
        }

        [Description("The string to search for in the response URL.  For example, enter 'Error.aspx' if you want to make sure the server does not redirect to that error page.")]
        public string UrlStringToFind {
            get { return _urlStringToFind; }
            set { _urlStringToFind = value; }
        }

        public override string RuleName {
            get { return "Validate Response URL"; }
        }

        public override string RuleDescription {
            get { return "Verifies the response URL.  This rule can be used to make sure a redirect to an error page does not occur, for example."; }
        }

        public override void Validate(object sender, ValidationEventArgs e) {
            //make sure the string to find has a value
            if (String.IsNullOrEmpty(_urlStringToFind)) {
                throw new ArgumentException("The UrlStringToFind property cannot be null or empty string.");
            }

            bool found = e.Response.ResponseUri.OriginalString.IndexOf(_urlStringToFind, StringComparison.OrdinalIgnoreCase) != -1;
            string foundMessage = found ? "was" : "was not";

            //set the result message that will appear in the web test viewer details tab and the load test error table
            e.Message = String.Format("The string '{0}' {1} found in the response URL.", _urlStringToFind, foundMessage);

            //set whether the validation passed or failed
            e.IsValid = found != _failIfFound;
        }
    }
}

As you can see, it doesn't take much code to implement this very useful validation rule.  I just added a new class to my test project, made it inherit from ValidationRule, and implemented the RuleName and Validate members.  The RuleDescription property and the [Description] attributes make custom validation rules easier to use by providing help text that shows up in the Add Validation Rule dialog (shown below) and the Properties window, but they are completely optional.

Looking at the Validate method, you might be scratching your head wondering how WebTestResponse.ResponseUri can be different than WebTestRequest.Url.  The answer is that when a request's FollowRedirects property is set to True, validation and extraction rules are automatically moved from the original request to the subsequent redirect request.

To use a custom validation rule in a web test, you must first let your test project know it exists.  If the rule class is included in the test project, just build the project and the rule will show up in the Add Validation Rule dialog.  If, however, the rule is part of a separate class library project and potentially shared by multiple test projects, just build that class library project and add a reference to it in your test project(s).  Now when you right-click a request and select Add Validation Rule, you should see your rule in the Add Validation Rule dialog shown below.

Add Validation Rule dialog

You can see in this case that I'm going to cause the request to fail validation if the server redirects to error.aspx.  If you have multiple error pages, you would set the FailIfFound property to False and enter the correct response URL in the UrlStringToFind property so validation will fail if the string is not found in the response URL.
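
The way FailIfFound flips the outcome is compact in the rule (found != _failIfFound), so it may help to see the decision on its own.  The sketch below restates it without the web test framework.  The ResponseUrlCheck class and the sample URLs in the note after it are illustrative only.

```csharp
using System;

// Framework-free restatement of the rule's pass/fail decision.
// The class and method names are illustrative only.
public static class ResponseUrlCheck
{
    public static bool IsValid(string responseUrl, string urlStringToFind, bool failIfFound)
    {
        bool found = responseUrl.IndexOf(urlStringToFind, StringComparison.OrdinalIgnoreCase) != -1;

        // Validation passes when "found" and "failIfFound" disagree.
        return found != failIfFound;
    }
}
```

With FailIfFound set to true and "error.aspx" as the string to find, a response URL of http://site/Error.aspx?code=500 fails and http://site/Default.aspx passes.  With FailIfFound set to false and "Default.aspx" as the string to find, http://site/Default.aspx passes and any other response URL fails.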

Please let me know if you have any questions or comments about this validation rule or custom validation rules in general.  I'm also open to ideas for what the next web test extensibility point example should be.

Posted by JoshCh | 7 Comments

VSTS RTM on MSDN

Who isn't excited about that string of acronyms?  I know I am.

The final version of Visual Studio Team System is now available for download by MSDN subscribers in advance of the November 7th launch date.  If you get the Team Suite or Team Edition for Software Testers (also called Team Test), feel free to post questions and comments to our MSDN support forum or right here on my blog.  The development and test teams for web and load testing are focusing on writing MSDN content and doing community support at the moment, so this is a great time to get your questions answered.

Posted by JoshCh | 1 Comments

Web and Load Testing Webcast (TODAY!)

MSDN Webcast: Load and Web Testing with Microsoft Visual Studio 2005 Team System (Level 200)    

Start Time:   Tuesday, October 25, 2005 1:00 PM (GMT-08:00) Pacific Time (US & Canada) 
End Time:   Tuesday, October 25, 2005 2:00 PM (GMT-08:00) Pacific Time (US & Canada) 
 
Products: Visual Studio.

Recommended Audience: Developer.

Language: English-American
 
Description: By using Microsoft Visual Studio 2005 Team System as a platform, you can better manage the software development life cycle. You have the flexibility to customize and extend this platform to meet organizational needs. In this webcast, gain a general understanding of the Web and load testing features in Visual Studio 2005.

Presenter: Ed Glas, Group Manager, Microsoft Corporation

Posted by JoshCh | 3 Comments

Meet up at ASP.NET Connections

I'll be in Las Vegas attending ASP.NET Connections from November 7th through 10th.  I'm mainly going so I can learn more about new and upcoming ASP.NET features that we want to make sure we cover in online samples and/or support in our next release.  I would also like to meet up with any and all VSTS web load testers (i.e. you) that are going to be there.  Leave a comment or send me an email if you're interested in getting together with me and other web load testers to share feedback, learn more about web tests, swap tips and tricks, etc.  I look forward to seeing you there.

Posted by JoshCh | 0 Comments

Creating custom IHttpBody classes for coded web tests

If you've recorded a web test and generated code, you've probably noticed the FormPostHttpBody class.  You might have even seen the StringHttpBody class if you had web service requests in your web test.  These are the only two built-in classes for generating HTTP request bodies, so what do you do if you need to send requests containing something other than form parameters and strings?  You just implement your own IHttpBody class.

I'm going to paste in a sample IHttpBody called BinaryHttpBody at the end of this post.  This class can be used in coded web tests to send requests with specific binary data in the request body.  You might use this if you want to send binary data in a PUT request body, for example.  Another use would be if you wanted to load pre-generated request bodies from files on disk.  As you can see, the constructor for BinaryHttpBody accepts either a byte array or a Stream.

To send an array of bytes, you would use BinaryHttpBody like this:

WebTestRequest request1 = new WebTestRequest("http://localhost/test.aspx");
request1.Method = "POST";
BinaryHttpBody binaryBody = new BinaryHttpBody("application/octet-stream", 
        new byte[] { 0x01, 0x02, 0x03 });
request1.Body = binaryBody;
yield return request1;

The following code for BinaryHttpBody should be compatible with the July CTP and later.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Microsoft.VisualStudio.TestTools.WebTesting;

namespace TestProject1 {
    public class BinaryHttpBody : IHttpBody {
        private string _contentType;
        private Stream _stream;

        public BinaryHttpBody(string contentType, byte[] bytes)
            : this(contentType, new MemoryStream(bytes, false)) {

        }

        public BinaryHttpBody(string contentType, Stream stream) {
            _contentType = contentType;
            _stream = stream;
        }

        public BinaryHttpBody() {

        }

        public string ContentType {
            get { return _contentType; }
            set { _contentType = value; }
        }

        public Stream Stream {
            get { return _stream; }
            set { _stream = value; }
        }

        public void WriteHttpBody(WebTestRequest request, System.IO.Stream bodyStream) {
            if (_stream != null && _stream.CanRead) {
                try {
                    byte[] buffer = new byte[8192];
                    int bytesRead;
                    while ((bytesRead = _stream.Read(buffer, 0, buffer.Length)) > 0) {
                        bodyStream.Write(buffer, 0, bytesRead);
                    }
                } finally {
                    _stream.Close();
                }
            }
        }

        public object Clone() {
            throw new NotImplementedException();
        }
    }
}

This is just one example of an extensibility point web tests provide. I plan to demonstrate custom validation and extraction rules next.

Posted by JoshCh | 3 Comments