I was recently troubleshooting a performance issue in the search function of a Sitecore 7.5 site I was working on. The search results were served via an MVC controller action that returned JSON to the front-end code. We were seeing very large request times, and one suspected culprit was the item access being performed after we received results back from Coveo. (We need to perform a lot of distance calculations on the results returned by Coveo, and we have yet to find an elegant way to offload that work to the Coveo servers.)
To verify this theory, I ran some tests with a performance profiler attached to the w3wp process. The profiler results pointed to the LINQ processing done by Coveo, and when I expanded the most expensive call stacks, the bulk of the time was spent in the XmlSerializer, which I thought was odd.
After a bit more digging, I learned that the Coveo ContentSearch provider for Sitecore uses Coveo's SOAP interface under the covers. Since our queries were returning large result sets, there was a lot of XML to shuttle through all of the SOAP proxy code. It turns out Coveo addresses this in their documentation on optimizing LINQ performance.
Of the two options, optimizing the fields returned in a query via the CoveoQueryFieldPipeline is ultimately what we chose. This pipeline allows you to specify which fields are returned during LINQ to Sitecore calls to the index. The setting is global, so you'll need to specify every field that any of your LINQ queries needs. What is somewhat unfortunate is that when calling the SOAP interface directly, you can specify the fields on a per-query basis.
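As a rough sketch of what that configuration looks like, a Sitecore patch file registers the fields with the pipeline. The processor type and element names below are illustrative placeholders, not Coveo's actual names; the exact configuration is in Coveo's documentation on optimizing LINQ performance.

```xml
<!-- Hypothetical patch sketch: processor/element names here are
     illustrative; consult Coveo's documentation for the real ones. -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <coveoQueryFieldPipeline>
        <processor type="MySite.Search.SpecifyReturnedFields, MySite">
          <!-- Because the setting is global, list every field
               that any LINQ query in the site needs. -->
          <fields hint="list">
            <field>title</field>
            <field>latitude</field>
            <field>longitude</field>
          </fields>
        </processor>
      </coveoQueryFieldPipeline>
    </pipelines>
  </sitecore>
</configuration>
```

Trimming the returned fields this way shrinks the XML payload for every query, which is exactly the XmlSerializer cost the profiler was surfacing.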
Ultimately, if we can move more of our distance math to the Coveo servers, we can add Take() calls to the LINQ query to further reduce the amount of data serialized in the SOAP call, but given the nature of the data in this particular situation, I don't know whether that will be possible.
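For illustration, a Take() call would slot into a standard ContentSearch query like the sketch below. The index name and the filter are hypothetical; the point is simply that capping the result count caps the XML that has to travel through the SOAP proxies.

```csharp
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.SearchTypes;

// "Coveo_web_index" and the TemplateName filter are placeholders --
// substitute your actual Coveo index name and query conditions.
using (var context = ContentSearchManager
    .GetIndex("Coveo_web_index")
    .CreateSearchContext())
{
    var results = context.GetQueryable<SearchResultItem>()
        .Where(item => item.TemplateName == "Store Location")
        .Take(20) // only 20 results get serialized in the SOAP response
        .GetResults();
}
```

This only works for us once the distance sorting happens server-side; as long as we have to pull the full result set back to do the distance math ourselves, truncating the query would drop results we still need.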