HTML5 / Web API - Displaying Large Tables

Introduction

Tables are an integral part of most web-based applications and enterprise systems however, few are implemented to handle large amounts of data. I'm going to demonstrate an alternative UI pattern that can handle reasonably large datasets in cases where SQL level pagination is not possible.

The most basic pattern consists of reading all required data from a database in a single "SELECT * WHERE ... ORDER BY ...", then creating a list of objects from this dataset, and passing this list to the UI for display. Pagination is then applied at the UI level. Note that typically not ALL columns are returned but certainly enough with which to build a complete business object or DTO. At the opposite end of the scale lies the ideal pattern whereby the application again executes "SELECT * ..." but this time only for the page rows visible on the UI. In this case, pagination occurs at the database level with SQL specific commands such as "LIMIT", " OFFSET", and "FETCH NEXT".


Image 1. Common Pagination Patterns

A Middle Ground

An alternative I'd like to present lies somewhere between these two patterns and exhibits usable performance without database level pagination. This makes it suitable for situations where you may not be able to use database specific commands. Perhaps an uncooperative ORM limits your database access or you are unable to alter a legacy data layer.

The pattern is very straight forward and relies on executing the usual initial database query with filtering and sorting applied, but rather than returning all required columns, a single "ID" column is returned. The UI receives the complete "ID" list but then only requests full row data for the page currently displayed. Pagination, therefore, occurs at the UI level. This essentially breaks the process into two phases. The first "SELECT id FROM ... WHERE ... ORDER BY ..." establishes all relevant rows and the second phase iterates through only those rows visible and executes a "SELECT * WHERE id=?" for each row where each query is mapped to a complete object. This simple modification is worth considering because even for a million records, returning a single column can be far more practical in terms of query performance and network bandwidth.

Most people don't expect there to be much difference between selecting one primary key column and selecting all/many columns so you might like to test this yourself first. My simple tests consisted of a million row table with ten columns including a single integer key. The key-only query "SELECT id FROM person ORDER BY id ASC" took about 33 seconds whilst the all-data query "SELECT * FROM person ORDER BY id ASC" took almost three minutes. A query with proper pagination "SELECT * FROM person ORDER BY id ASC OFFSET 400000 ROWS FETCH NEXT 100 ROWS ONLY" took 13 seconds for every page. Note that these figures are for my very old and slow hand-me-down laptop.


Image 2. Alternative Pagination Pattern

An Example Application

Below I'll be describing a browser-based single page example using C# and the newish Web API. The database consists of a single table using MS SQL Server Express. MVC is, thankfully, on the way out and RPC-style Client/Server approaches are making a comeback. This is a big win for proper object-oriented design as MVC was "a dog's breakfast", but I digress.

UI

Listing 1 includes all the HTML for the UI and displays a single 25 row results table (Lines 94-110) with a "next" and "previous" button for paging (Lines 106-107). Additionally, a "search" field and button is displayed above the table (Line 90). No fancy formatting has been applied for simplicity's sake.

The 2-phase aspect of this implementation can be seen with the initial query being executed when the "Go" search button is clicked (Lines 61-66) and the following per-row queries being executed when the current page is rendered (Lines 41-59).
1:  <!DOCTYPE html>  
2:  <html>  
3:    
4:   <head>  
5:    <title>Large Dataset Example</title>  
6:     <meta charset="utf-8" />  
7:    <script src="jquery.js"></script>  
8:    <script>  
9:     var person_ids, page_start=0, page_size=25, start_time;  
10:    
11:     function OnBindTable(ids)  
12:     {  
13:      var elapsed_time, now;  
14:    
15:      now = new Date();  
16:      elapsed_time = (now.getTime() - start_time.getTime()) / 1000;  
17:      $("#status").text("Retrieving person data. (" + elapsed_time + "s)");  
18:    
19:      person_ids=ids;  
20:      page_start=0;  
21:      OnBindPage(ids, page_start, page_size);  
22:     }  
23:    
24:     function OnBindPage(ids, page_start, page_size)  
25:     {  
26:      var c, row;  
27:    
28:      if (ids != null && ids.length > 0)  
29:      {  
30:       $("#table_body").empty();  
31:       for (c = page_start; c < ids.length && c < page_start + page_size; c++)  
32:       {  
33:        row = $("<tr>");  
34:        $("#table_body").append(row);  
35:    
36:        OnBindRow(ids[c], c+1, row);  
37:       }  
38:      }  
39:     }  
40:    
41:     function OnBindRow(person_id, row_no, row)  
42:     {  
43:      $.getJSON("api/Person/GetPerson", { id: person_id }, function (person)  
44:      {  
45:       if (person != null)  
46:       {  
47:        row.append($("<td>").text(row_no));  
48:        row.append($("<td>").text(person.name));  
49:        row.append($("<td>").text(person.birthday));  
50:        row.append($("<td>").text(person.height));  
51:        row.append($("<td>").text(person.weight));  
52:        row.append($("<td>").text(person.address));  
53:        row.append($("<td>").text(person.phone));  
54:        row.append($("<td>").text(person.email));  
55:        row.append($("<td>").text(person.age));  
56:        row.append($("<td>").text(person.savings));  
57:       }  
58:      });  
59:     }  
60:    
61:     function OnClickGo()  
62:     {  
63:      $("#status").text("Working! Please wait.");  
64:      start_time = new Date();  
65:      $.getJSON("api/Person/GetIds", { name_filter: $("#name_filter").val() }, OnBindTable);  
66:     }  
67:    
68:     function OnClickNext()  
69:     {  
70:      if (person_ids != null && person_ids.length > 0 && page_start + page_size < person_ids.length)  
71:      {  
72:       page_start += page_size;  
73:       OnBindPage(person_ids, page_start, page_size);  
74:      }  
75:     }  
76:    
77:     function OnClickPrevious()  
78:     {  
79:      if (person_ids != null && person_ids.length > 0 && page_start - page_size >= 0)  
80:      {  
81:       page_start -= page_size;  
82:       OnBindPage(person_ids, page_start, page_size);  
83:      }  
84:     }  
85:     </script>  
86:    </head>  
87:    
88:   <body>  
89:    <div>  
90:     Search <input id="name_filter" type="text" /><button onclick="OnClickGo()">Go!</button>  
91:     <span id="status"></span>  
92:    </div>  
93:    
94:    <table border="1">  
95:     <thead>  
96:      <tr>  
97:       <th>No.</th><th>Name</th><th>Birthday</th><th>Height</th><th>Weight</th>  
98:       <th>Address</th><th>Phone</th><th>E-Mail</th><th>Age</th><th>Savings</th>  
99:       </tr>  
100:      </thead>  
101:     <tbody id="table_body">  
102:      <tr><td colspan="10">No data to display. Click "Go" to start.</td></tr>  
103:     </tbody>  
104:     <tfoot>  
105:      <tr><td colspan="10">  
106:       <button onclick="OnClickPrevious()">Previous</button>  
107:       <button onclick="OnClickNext()">Next</button>  
108:       </td></tr>  
109:     </tfoot>  
110:    </table>  
111:    
112:    </body>  
113:    
114:   </html>  
115:    

Business Logic

The business logic consists of a single Web API controller with the two required functions to return a list of Ids and return a single object given a particular Id. No ORMs such as the Entity Framework were used so as to not obfuscate the database calls. The functions are fairly standard so I won't go into any further details.

1:    
2:  namespace LargeData_App.Controllers  
3:  {  
4:   public class PersonController : System.Web.Http.ApiController  
5:   {  
6:    public string conn_str = "Data Source=(LocalDB)\\MSSQLLocalDB;AttachDbFilename=D:\\data\\code\\LargeData_Solution\\LargeData_App\\App_Data\\db.mdf;Integrated Security=True";  
7:    
8:    [System.Web.Http.Route("api/person/GetPerson")]  
9:    public Person GetPerson(int id)  
10:    {  
11:     System.Data.SqlClient.SqlConnection db;  
12:     System.Data.SqlClient.SqlCommand query;  
13:     System.Data.SqlClient.SqlDataReader res;  
14:     Person person=null;  
15:    
16:     db = new System.Data.SqlClient.SqlConnection(conn_str);  
17:     if (db != null)  
18:     {  
19:      db.Open();  
20:      if (db.State == System.Data.ConnectionState.Open)  
21:      {  
22:       query = db.CreateCommand();  
23:       query.CommandText = "select * from Person where id=@1";  
24:       query.Parameters.Add(new System.Data.SqlClient.SqlParameter("1", id));  
25:    
26:       res = query.ExecuteReader();  
27:       if (res != null)  
28:       {  
29:        if (res.Read())  
30:        {  
31:         person = new Person();  
32:         person.id = (int)res.GetValue(res.GetOrdinal("id"));  
33:         person.name = (string)res.GetValue(res.GetOrdinal("name"));  
34:         person.birthday = (System.DateTime)res.GetValue(res.GetOrdinal("birthday"));  
35:         person.height = (double)res.GetValue(res.GetOrdinal("height"));  
36:         person.weight = (double)res.GetValue(res.GetOrdinal("weight"));  
37:         person.address = (string)res.GetValue(res.GetOrdinal("address"));  
38:         person.phone = (string)res.GetValue(res.GetOrdinal("phone"));  
39:         person.email = (string)res.GetValue(res.GetOrdinal("email"));  
40:         person.age = (int)res.GetValue(res.GetOrdinal("age"));  
41:         person.savings = (decimal)res.GetValue(res.GetOrdinal("savings"));  
42:        }  
43:        res.Close();  
44:       }  
45:    
46:       db.Close();  
47:      }  
48:     }  
49:    
50:     return person;  
51:    }  
52:    
53:    [System.Web.Http.Route("api/person/GetIds")]  
54:    public int[] GetIds(string name_filter)  
55:    {  
56:     System.Data.SqlClient.SqlConnection db;  
57:     System.Data.SqlClient.SqlCommand query;  
58:     System.Data.SqlClient.SqlDataReader res;  
59:     System.Collections.Generic.List<int> res_list=null;  
60:     int[] res_array = null;  
61:    
62:     db = new System.Data.SqlClient.SqlConnection(conn_str);  
63:     if (db != null)  
64:     {  
65:      db.Open();  
66:      if (db.State == System.Data.ConnectionState.Open)  
67:      {  
68:       query = db.CreateCommand();  
69:       if (string.IsNullOrEmpty(name_filter))  
70:        query.CommandText = "select id from Person order by name";  
71:       else  
72:       {  
73:        query.CommandText = "select id from Person where name like @1 order by name";  
74:        query.Parameters.Add(new System.Data.SqlClient.SqlParameter("1", "%"+name_filter+"%"));  
75:       }  
76:    
77:       res = query.ExecuteReader();  
78:       if (res != null)  
79:       {  
80:        res_list = new System.Collections.Generic.List<int>();  
81:        while (res.Read())  
82:        {  
83:         res_list.Add((int)res.GetValue(0));  
84:        }  
85:        res.Close();  
86:       }  
87:    
88:       db.Close();  
89:      }  
90:     }  
91:    
92:     if (res_list != null && res_list.Count > 0)  
93:      res_array = res_list.ToArray();  
94:    
95:     return res_array;  
96:    }  
97:   }  
98:  }  
99:    

Database

The database consists of a single table named "Person" with ten fields of associated data. This was a purely arbitrary choice and the rows consisted of mostly random data generated by some C# code which has not been included here. Something that should be noted is that the "id" field is the table's primary key and that an additional index was created to assist with the filtering functionality.

1:   CREATE TABLE [dbo].[Person] (  
2:     [id]       INT IDENTITY (1, 1) NOT NULL,  
3:     [name]     NCHAR (128) NULL,  
4:     [birthday] DATE NULL,  
5:     [height]   FLOAT (53) NULL,  
6:     [weight]   FLOAT (53) NULL,  
7:     [address]  NCHAR (1024) NULL,  
8:     [phone]    NCHAR (64) NULL,  
9:     [email]    NCHAR (256) NULL,  
10:    [age]      INT NULL,  
11:    [savings]  MONEY NULL,  
12:    PRIMARY KEY CLUSTERED ([id] ASC)  
13:  );  
14:    
15:    
16:  GO  
17:  CREATE NONCLUSTERED INDEX [person_idx]  
18:    ON [dbo].[Person]([name] ASC);  

Conclusions

The approach I've outlined above may seem simple to the point of being trivial but it's worthwhile keeping in mind because, from my experience, it's very often overlooked. At a previous employer the teams there were using the naive approach of extracting all required data in a single query and this was, understandably, causing performance issues when the data reached over a few thousand records since they had to traverse through various internal SOAP services and the Internet. The final solution was to compress the extracted data but this merely moved the problem to such time as the dataset grew to hundreds of thousands of records. My advise to you is, don't neglect a simple solution until you've given it a go.

Additional Resources

Popular Posts