We have a large trade item table with 2.125.590 records. This table is visible in the application and uses a top 1000 limit (configured via the SF).

A user is currently allowed to set a sorting order on 8 different columns.

When a user applies a sort order on one column, it takes up to 16 seconds for the application to show the sorted set. And the Excel-style filtering can't handle it at all: trying to place a filter via the Excel-style filtering on a column header takes so long that I have to force-close the application via the Windows Task Manager.

What's the best practice for handling these performance issues? I could create an index for every possible sort column (8 in total), but what if a user wants to sort on multiple columns at once? Then the created index no longer covers the query.
If disabling column sorting for some of the columns is not an option, then yes, creating an index for every column would definitely help. Although these indexes won't cover sorting on multiple columns at once, they will speed up retrieving the top 1000 records for the first sorted column.
We still want to allow sorting by users. So that's not an option.

If we create an index for every possible sort column in the table, won't that slow down insert, update and delete actions? After all, every index needs to be updated whenever a record is modified.

In this case we have a valid argument to create the indexes for sorting, because the table 'trade_item' will mostly be used to look up information rather than to manipulate it.
Yes, having many indexes on a table will cause insert, update and delete actions to perform worse. It's difficult to say how much worse, but in general the benefits for select actions outweigh the costs for insert, update and delete actions, especially when the table is mostly used for reading.
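As a rough sketch of what that could look like (hypothetical column names, assuming SQL Server syntax):

```sql
-- One narrow nonclustered index per sortable column (hypothetical names).
-- Each index lets the top 1000 rows be read in sorted order without
-- scanning and sorting all 2.125.590 records.
CREATE NONCLUSTERED INDEX IX_trade_item_description ON trade_item (description);
CREATE NONCLUSTERED INDEX IX_trade_item_category    ON trade_item (category);
CREATE NONCLUSTERED INDEX IX_trade_item_vendor      ON trade_item (vendor);
-- ...and so on for the remaining sort columns (8 in total).
```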

To add to the solution regarding indexes, it is also worth considering adding prefilters for common scenarios or creating various master-detail scenarios to view and constrain the data from different perspectives (e.g. "trade items by period", "trade items by category", "trade items by buyer/seller/vendor/producer"). Constraining the data set is often an effective way to solve poor sorting performance.
As @Vincent Doppenberg suggests, it is probably best to logically partition the data. I can't imagine these 2.125.590 records do not belong to some kind of parent.
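Purely as an illustration of that idea (hypothetical parent key and column names): once the set is constrained to a parent, a composite index that starts with the parent key makes the sorted top 1000 cheap to retrieve.

```sql
-- Hypothetical: constrain to a parent (e.g. a category) before sorting.
CREATE NONCLUSTERED INDEX IX_trade_item_category_description
    ON trade_item (category_id, description);

DECLARE @category_id int = 42;         -- hypothetical parent key, set by the prefilter / master record

SELECT TOP (1000) ti.*
FROM trade_item AS ti
WHERE ti.category_id = @category_id
ORDER BY ti.description;               -- rows come back in index order, so no full sort is needed
```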

Another option is to provide a task that handles the filtering via dynamic SQL and stores user context. Write a query that collects the data based on the filter input, write this data to a separate table and include the user context (i.e. username). After the task has run you can take the user to the result (filtered on their user context) so the user can see the data they were after. A sketch of this approach follows the list of risks below.

There are a few risks to this approach:

  • you are duplicating data, risking it getting out of date
  • you need to remove the data from the previous search
  • you need to provide a means of maintaining this data set so it can be cleaned up
  • when using dynamic SQL, always use sp_executesql and parameterise your user input to prevent SQL injection (e.g. a search for 's-Hertogenbosch)
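
A minimal sketch of such a task, assuming SQL Server, a hypothetical result table trade_item_search_result and a single hypothetical filter on description (the real filter columns and task parameters would of course come from the application):

```sql
-- Hypothetical result table: CREATE TABLE trade_item_search_result
--   (username sysname NOT NULL, trade_item_id int NOT NULL);
DECLARE @username    sysname       = SUSER_SNAME();        -- user context
DECLARE @description nvarchar(200) = N'''s-Hertogenbosch'; -- filter input from the task

DECLARE @sql nvarchar(max) = N'
    DELETE FROM trade_item_search_result WHERE username = @username;  -- remove the previous search

    INSERT INTO trade_item_search_result (username, trade_item_id)
    SELECT @username, ti.trade_item_id
    FROM trade_item AS ti
    WHERE ti.description LIKE @description + N''%'';';

-- Parameterised execution: the apostrophe in 's-Hertogenbosch cannot break out of the statement.
EXEC sp_executesql
    @sql,
    N'@username sysname, @description nvarchar(200)',
    @username    = @username,
    @description = @description;
```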

I am also wondering if columnstore indexes could help in this scenario?
