  1. Determine which fields on a page are required?
    <p>I'm working as a UI tester at a small software company. To make my life easier, I'm trying to write a scraper in Python that will automatically generate some of the standard tests run on every page. Testing is done in QuickTest Pro, and the tests need to be written in VBScript. Every page that creates data needs a full case, where every field on the page gets filled out, and a number of reduced cases, where only the required fields get filled out.</p>
    <p>The full case should be easy -- I plan to set up a requests.Session object with an already-authenticated cookie, send a GET request to the appropriate page, and parse the response with BeautifulSoup.</p>
    <p>I'm less sure how to approach the reduced cases. I can think of three ways to go about it, but none of them sound great:</p>
    <p>A) Try to submit a blank page. Check the response for error messages of the form "* <code>&lt;field&gt;</code> is a required field." Look for the fields whose names are closest to the one specified. Fill them out. Try to submit again, and repeat, adding fields until it goes through successfully, and return a list of fields.</p>
    <p>This isn't great because it's difficult to identify which field an error message corresponds to. A message stating that "* Birth date is required" might actually be referring to a form element with an HTML ID of "dob_entry1". I'm also testing on a development copy of the source, so it's not unusual for partially filled-out forms to cause a server error, and I'd probably need to manually clean up any data that this approach creates.</p>
    <p>B) Send in a fully filled-out form. Find the database record(s) that just got created, and find out which columns are NOT NULL. Match column names to field names, and return the resulting list.</p>
    <p>This seems more promising, but I'm not sure how to go about finding the records that were created. Logging (except for errors) is not turned on for the MySQL server, and the server has ~15 databases on it, all of which are being worked on by developers, so I can't mess with the server's global variables to turn it on. I could query the database for all of the values that I just passed in, but there's already a huge amount of data in the db, so it's unlikely that I would be able, for example, to figure out which date of birth is the one that I just submitted.</p>
    <p>From some Googling, tools like <a href="http://hackmysql.com/mysqlsniffer" rel="nofollow">http://hackmysql.com/mysqlsniffer</a> might be an option, but I'm wary of doing anything to the server as a whole, since the developers will be using other dbs on the server at the same time. I don't have much experience with SQL, so I'm not very sure how to go about doing this.</p>
    <p>C) Somehow parse the C# source code to find the query that corresponds to a given page. Find out which columns it affects, query the database to find out which are NOT NULL, match the column names to field names, and return a list.</p>
    <p>I have no experience with C#, so I don't know how feasible this is, but if it were PHP I think it would be pretty simple. I could find the source for the site if I poked around, but I haven't looked at any of it yet. The website is ~10 years old and pretty massive, so matching page names to source files is probably non-trivial.</p>
    <p>I imagined that finding out which fields of a form are required to submit a page would be a pretty common task for scrapers, but Google hasn't turned up much. Are any of these approaches reasonable? Is there an easy solution that I'm missing?</p>
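<p>To make the matching problem in approach A concrete, here is a minimal sketch of one way the "Birth date" versus "dob_entry1" gap might be bridged. It assumes the page marks its fields up with <code>&lt;label for="..."&gt;</code> elements, which the question does not confirm. It uses the standard library's <code>html.parser</code> in place of BeautifulSoup so that it runs without extra installs. The names <code>LabelCollector</code> and <code>field_for_error</code> are hypothetical.</p>

```python
from difflib import get_close_matches
from html.parser import HTMLParser


class LabelCollector(HTMLParser):
    """Collect {lower-cased label text: target field id} from <label for="...">."""

    def __init__(self):
        super().__init__()
        self.labels = {}
        self._current_for = None  # id the open <label> points at, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            self._current_for = dict(attrs).get("for")
            self._text = []

    def handle_data(self, data):
        if self._current_for is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._current_for is not None:
            # Drop trailing decoration such as "Birth date:" or "Birth date *".
            text = "".join(self._text).strip().rstrip(":*").strip()
            self.labels[text.lower()] = self._current_for
            self._current_for = None


def field_for_error(error_label, html):
    """Return the id of the field whose label text is closest to error_label.

    Returns None when no label is similar enough (the 0.6 cutoff is an
    arbitrary assumption, not a tuned value).
    """
    collector = LabelCollector()
    collector.feed(html)
    match = get_close_matches(
        error_label.lower(), list(collector.labels), n=1, cutoff=0.6
    )
    return collector.labels[match[0]] if match else None
```

<p>Called with the label extracted from an error message, for example <code>field_for_error("Birth date", page_html)</code>, this looks the field up by its visible label instead of guessing from the HTML ID. If the pages do not use <code>&lt;label for&gt;</code>, this sketch does not apply and some other mapping would be needed.</p>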