Monday, 22 September 2014

Testing Search

I got thinking about Search testing the other day and ended up waking at 4am to scribble a mind-map of thoughts before I lost them. As you do.

The main thought was how Search testing traverses the three main layers of testing we typically consider.

* Where - the web front end in which the user builds their search queries
* How - the Search 'engine' that does the work of polling the available data
* What - the data that is being searched

Target of the search
Before any testing commences, we need to understand as fully as possible what it is that can be searched for. To do that we need to be clear on the data that is the target for the search. Though it's reasonable to expect the searched-for data to be static in a database, the returned results might also be dynamically generated by the search query. Not all data returned may be under the control of the business; remember the search may also be fed by external data.

- What data is the user searching for? (e.g. products, account records, flights, ...)
- What data sets are available? (e.g. product attributes, transactions/payment types, current/future flights, ...)
- What is the source of the data? (e.g. static data, dynamically created data, external sources, ...)

The Search Engine
The most informative way of learning how the search engine works is via its API documentation. With this we'll know what can be passed to the engine, which in turn shapes the scope and structure of allowable queries. Some good public examples of Search APIs are those for Google and Twitter.

If your development team can't provide the API docs, ask for access to the Javadocs, unit tests or whatever else can inform you about the implementation specifics. In my experience, the design of search is rarely detailed enough for testers in specification documents or requirements statements. Indeed, in a more agilistic setting it's very likely this detail is closer to the code - get sauce! You can't properly test the implementation from just the original specification.

To counter the obvious challenge here of 'you don't need to worry about that... just run your tests... confirm the requirements have been met... ': as part of the implementation team, you are not restricted to just looking at the acceptance tests or requirements. In the tester role you are working on an equal footing with all other roles, and so the code, unit tests, etc. are not 'off limits', just as your tests and automation code are not off limits to anyone. If you are told they are, then you are not in an agile team, or whoever is saying this needs to get off the agile team. (Whether you can understand the code, tests, etc. or not is another matter.)

Building Queries
Our next concern, closer to the user, is the UI and how it allows search queries to be crafted.

- How can a search query be entered? (free text, drop-down, ...)
- How can search options be used? (Boolean, switches, ...)

Now that we have knowledge of what the search functionality actually is, as a component of the system, let's think about what testing will be needed.

We clearly have our typical functional testing, which will include submitting search queries. However, we need to break that down too, to ensure we're clear about what we're actually testing.

As we have a UI, we'll need to test the functionality a user is provided with to build a search query. This might be a simple free-text field like Google, where the user just enters whatever text they want with no switches, drop-downs or options. Be aware this can have hidden nuances too, though. At first glance Google search functionality is just a text field, but in fact we have a bunch of ways to structure our search query.

For example, you can enter 'define: testing' to get a dictionary definition. Or, to search for files in a given directory, try '-inurl:htm -inurl:html intitle:"index of" + ("/secret")' to see how not to hide your password files and pictures of your ex girlfriend. Don't do that search at work, by the way! If you've reviewed the API docs or something similar, you should know whether these types of searches are available to you.

For searches constructed by selecting from drop-down boxes, using radio buttons, etc. it'll be more apparent what choices you have. Again, be careful to understand where the data in drop-downs, for example, is coming from. As always, view source. Is that drop-down populated via an Ajax call, a fixed list in the HTML or a list from another JavaScript? How those selectable search options can be chosen will affect the specific test cases that are possible. Remember to do some equivalence partitioning where any lists are concerned. As with other tests, it's highly unlikely you'll need to test all combinations.
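
As a rough illustration of that partitioning, here is a minimal Python sketch. The field names, option values, class groupings and helper functions are all hypothetical, and it assumes you have already decided which options the search treats alike. It takes one representative per equivalence class and combines only those, rather than every option with every other.

```python
from itertools import product

# Hypothetical drop-down options, grouped into equivalence classes.
# Assumption: every value in a class is handled the same way by the search.
PARTITIONS = {
    "category": {
        "physical": ["books", "toys", "garden"],
        "digital": ["ebooks", "music"],
    },
    "price_band": {
        "free": ["0"],
        "paid": ["1-10", "11-50", "51+"],
    },
}


def pick_representatives(partitions):
    """Return one value (the first listed) per equivalence class, per field."""
    return {
        field: [values[0] for values in classes.values()]
        for field, classes in partitions.items()
    }


def build_cases(partitions):
    """Combine the representatives of every field into search test cases."""
    reps = pick_representatives(partitions)
    fields = list(reps)
    combos = product(*(reps[field] for field in fields))
    return [dict(zip(fields, combo)) for combo in combos]


if __name__ == "__main__":
    for case in build_cases(PARTITIONS):
        print(case)
```

With these made-up lists, the 20 possible combinations (5 categories by 4 price bands) reduce to 4 cases. Whether the partitions are right is still a judgement call that needs checking against the code or API docs.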

Obvious initial tests will be data entry, using valid and invalid inputs, leading spaces, special characters and all the other standard cheat-sheet heuristics. However, we need to be careful here, as this is more likely form-field input validation, which is not search testing. Be sure again to view source and see where that validation is taking place, client or server side. Hopefully it's not an embedded JavaScript. Check the source, and if you see a script src with a validation-sounding JavaScript name, save a local copy and inspect it. Oh wait, you don't need to do that, because other members of your implementation team have shared these items via your CVS / Git / etc. and you can review them.
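
The data-entry heuristics mentioned above can be generated rather than typed by hand. The Python sketch below is only an illustration: the function name, the labels and the particular variations are my own picks, not a complete cheat-sheet.

```python
def input_variants(term):
    """Return labelled variations of a search term for data-entry checks."""
    return {
        "as_is": term,
        "leading_spaces": "   " + term,
        "trailing_spaces": term + "   ",
        "upper_case": term.upper(),
        "empty": "",
        "spaces_only": "   ",
        "special_chars": term + " !@#$%^&*()",
        "single_quote": term + "'",
        "markup": "<b>" + term + "</b>",
        "very_long": term * 500,
    }


if __name__ == "__main__":
    for label, value in input_variants("shoes").items():
        # Truncate long values so the listing stays readable.
        print(f"{label:16} {value[:40]!r}")
```

Keeping the labels with the values makes it easier to note, for each variation, whether it was rejected by form validation or actually reached the search engine.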

We've all experienced using a search engine and getting a result that is nothing like what we were after. The underlying challenge here is a code and engineering problem, but it's part of our job to show how inaccurate results are. What counts as inaccurate will likely be decided by a combination of the requirements and our experience / gut-feel. When we use search engines ourselves, we'll often get results that are technically correct and yet wrong. It'll be a bunch of blog posts on a given topic where half are barely relevant, or a search for 'coco' that uncovers a family (un)friendly set of pictures of a lady looking rather 'distorted'.
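
One way to put a number on 'how inaccurate' is precision at k: of the first k results, what fraction did a reviewer judge relevant? This is a technique I'm swapping in for illustration, not something the requirements will necessarily ask for. The minimal Python sketch below assumes you have a hand-judged set of relevant result IDs for the query; the function name and the IDs are hypothetical.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the first k results that were judged relevant.

    results: ordered list of result IDs returned by the search.
    relevant: set of result IDs a reviewer judged relevant for the query.
    Divides by k, so returning fewer than k results lowers the score.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in results[:k] if item in relevant)
    return hits / k


if __name__ == "__main__":
    returned = ["a", "b", "c", "d"]
    judged_relevant = {"a", "c", "x"}
    print(precision_at_k(returned, judged_relevant, 4))
```

The gut-feel part doesn't go away, since someone still has to decide what belongs in the relevant set, but it gives a figure that can be compared across releases.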

When we perform a search it's reasonable to expect the results will be the same if the underlying data and search engine logic are the same. This will form the basis for some of the regression testing we'll want to conduct over successive releases.
However, there are occasions when the same search string will bring back different results.

* The search database is replicated and data varies between the databases
* Data has changed since we last ran the query

This should show up as a problem with search result consistency. Either the search results will be consistently different from what was returned in a previous test run, or they will sometimes be different. Where the results are consistently different, we just need to validate that this is as expected and then update our script. For results that should be the same but are sometimes as expected and at other times not, we need to look at where the query is going. A common cause of inconsistent search results is the combination of data replication across multiple databases and routing due to load balancing. When we conduct our search, we might not be 100% certain which data sets we are hitting and which server our query is hitting. These are questions to take up with whoever is fulfilling the Dev Ops / Infrastructure role in the team. We need to understand the data replication process: are all servers copied to at the same time or in some kind of order? Is there a back-up process that takes servers off-line when we might be testing?
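
The distinction between 'consistently different' and 'sometimes different' can be sketched in Python as below. This is a minimal illustration under my own assumptions: results are reduced to ordered lists of IDs, a baseline has been saved from an earlier verified run, and the function name and labels are hypothetical.

```python
def classify_runs(baseline, runs):
    """Compare repeated runs of one query against a saved baseline.

    baseline: list of result IDs saved from an earlier, verified run.
    runs: list of result-ID lists, one per repeat of the same query.
    Order matters here: the same IDs in a different order count as different.
    """
    if not runs:
        raise ValueError("need at least one run")
    distinct = {tuple(run) for run in runs}
    if len(distinct) > 1:
        # Same query, different answers: look at replication and load balancing.
        return "intermittent"
    if tuple(baseline) == tuple(runs[0]):
        return "matches baseline"
    # Every run agrees but differs from the baseline: confirm the data change
    # was expected, then update the baseline.
    return "consistently different"


if __name__ == "__main__":
    saved = [101, 102, 103]
    print(classify_runs(saved, [[101, 102, 103], [101, 102, 104]]))
```

An 'intermittent' outcome is the cue to ask which server and which copy of the data each run actually hit.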

For all of the above we could be using a simple tool such as Selenium with a dashboard to show results. Selenium allows us to run the tests in a loop, vary the query, save down results files, etc. What I would not use it for, although I know it does get used for this, is performance.

Another aspect of search is the speed at which we get back our search results. We'd expect this to vary, but not by much; the usual changes in network traffic and machine resources are fine to a degree. When we start to see notable slowdowns, we need to investigate. To help with performance testing use a performance testing tool; as above, Selenium ain't it. Grab a copy of JMeter if you're working with open source tools. It should be an easy matter to replicate your tests in JMeter and build them out into a test plan that lets you performance test search.
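
To make 'notable slowdown' a little more concrete, here is a rough Python sketch. It is not a substitute for a JMeter test plan. The function names are hypothetical, `search` stands in for whatever callable submits your query, and the 50% tolerance is an arbitrary assumption to tune for your own system.

```python
import statistics
import time


def time_call(search, query):
    """Return how long one call to search(query) took, in seconds."""
    start = time.perf_counter()
    search(query)
    return time.perf_counter() - start


def notable_slowdowns(timings, tolerance=0.5):
    """Return the timings more than `tolerance` (50% by default) above the median."""
    median = statistics.median(timings)
    limit = median * (1 + tolerance)
    return [t for t in timings if t > limit]


if __name__ == "__main__":
    # Made-up response times in seconds, for illustration only.
    samples = [0.20, 0.22, 0.21, 0.90, 0.19]
    print(notable_slowdowns(samples))
```

Comparing against the median rather than the mean keeps one very slow response from hiding itself by dragging the average up.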

In closing
In this post we've looked at the fact that search testing is not just putting a few search strings into a text field and checking the results don't look wrong. We have the full scope of functional and performance testing, along with data accuracy and consistency, to consider. To test thoroughly we need sight of the requirements, but also the code or API docs and an understanding of the network infrastructure that's in place.

