Tuesday, November 20, 2007

Automated Form Submission Using axWebBrowser With No Id Tags


It's fairly simple to manipulate the DOM of a web page in the Internet Explorer ActiveX control when the elements are nicely marked up with id attributes. You can just call getElementById("Id") on the document and stuff the result into an IHTMLElement. You can then diddle with the button, field or whatever other HTML element you like to your heart's content. But what about those fields and buttons the page creator has neglected to give a unique identifier? The workaround is to put the element, or group of elements if others match the same criterion, into an IHTMLElementCollection. This makes it rather more difficult to access the individual form element you need. There are a few examples of how this is done, but the only ones I found easily weren't in C++. I took the logic from a Visual Basic example and translated it to this:

void FormSubmit(){
    // The cast must be to IHTMLDocument3 or it fails: that is the interface
    // that provides getElementsByTagName.
    IHTMLDocument3^ doc=(IHTMLDocument3^)axWebBrowser1->Document;

    // This gets all of the html input elements on the page. Buttons and textboxes.
    IHTMLElementCollection^ docelems=doc->getElementsByTagName("input");

    // This is for a textarea control that doesn't fall under the input category.
    // The same document reference is reused for the second collection.
    IHTMLElementCollection^ docelems2=doc->getElementsByTagName("textarea");

    // This runs through each element in the collection to see if its markup contains
    // a particular string that you've identified as unique within that collection.
    for each (IHTMLElement^ inputElement in docelems){
        // This part is of particular interest as it checks a checkbox by replacing
        // the entire element. Other ways are more difficult to implement with this
        // particular type for some reason.
        if (inputElement->outerHTML->Contains("Checkbox1")){
            inputElement->outerHTML="<input type=\"checkbox\" name=\"Checkbox1\" checked/>";
        }
        else if (inputElement->outerHTML->Contains("recipient")){
            inputElement->innerText=windowsTextbox->Text;
        }
        else if (inputElement->outerHTML->Contains("sender")){
            inputElement->innerText="Anonymous";
        }
    }

    // Same idea for the textarea collection.
    for each (IHTMLElement^ inputElement in docelems2){
        if (inputElement->outerHTML->Contains("text")){
            inputElement->innerText="This is not a test.";
        }
    }

    // The button is clicked in a separate pass, after every field has been filled in.
    for each (IHTMLElement^ inputElement in docelems){
        if (inputElement->outerHTML->Contains("SendButton")){
            inputElement->click();
        }
    }
}

This could be implemented with only one collection instance by using the document's all collection (IHTMLDocument2->all) instead of the getElementsByTagName method. Using getElementsByTagName, however, reduces the number of items in each collection and minimizes the chance of false positives when dealing with multiple element types.
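
To make the false-positive point concrete, here is a minimal sketch in plain standard C++ rather than C++/CLI, so it runs without the browser control. Everything in it is hypothetical: the Element struct, byTagName, countMatches and the markup in samplePage are stand-ins made up for illustration, not MSHTML types or the markup Internet Explorer actually returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an HTML element: only the two fields this sketch needs.
struct Element {
    std::string tag;        // e.g. "input" or "textarea"
    std::string outerHTML;  // full markup of the element
};

// Mimics the effect of getElementsByTagName: keep only elements with the given tag.
std::vector<Element> byTagName(const std::vector<Element>& all, const std::string& tag) {
    std::vector<Element> out;
    for (const Element& e : all) {
        if (e.tag == tag) out.push_back(e);
    }
    return out;
}

// Counts the elements whose markup contains the marker string,
// which is the same substring test the post does with Contains().
std::size_t countMatches(const std::vector<Element>& elems, const std::string& marker) {
    assert(!marker.empty());  // an empty marker would match every element
    std::size_t n = 0;
    for (const Element& e : elems) {
        if (e.outerHTML.find(marker) != std::string::npos) ++n;
    }
    return n;
}

// Made-up page contents, loosely modelled on the form in the post.
std::vector<Element> samplePage() {
    return {
        {"input", "<input type=\"text\" name=\"recipient\">"},
        {"input", "<input type=\"text\" name=\"sender\">"},
        {"textarea", "<textarea name=\"text\"></textarea>"},
        {"input", "<input type=\"submit\" name=\"SendButton\">"},
    };
}
```

With this sample page, countMatches(samplePage(), "text") gives three hits, because the two text inputs carry type="text" in their markup, while countMatches(byTagName(samplePage(), "textarea"), "text") gives exactly one. That is the whole argument for the per-tag collections: a short marker string only has to be unique among elements of one type.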
