Category Archives: SQL Basics

Sql Indexes

Indexes are vital for good database performance. Pick the correct index and your queries will be fast and responsive. Pick the wrong ones, and you’ll waste a lot of time and resources.

An index on a table is simply a way of looking up the data contained within the table. It should be as small as possible whilst still being useful.

For example, you might have a table that contains information about people. Your primary key might be a simple ID number, and you’d also store things like Forename, Surname, Date of Birth etc.

ID  Forename  Surname  DateOfBirth
1   John      Smith    1990-01-01
2   James     Apple    1980-12-10
3   Phil      Barns    1975-06-06
4   Mary      Stokes   1987-09-23

It would be very common to query this table by searching on Surname.

SELECT
  Forename, Surname, DateOfBirth
FROM
  People
WHERE
  Surname LIKE 'A%'

If you turn on the “Include Actual Execution Plan” with this query in Sql Server Management Studio, you will see a “Table Scan”. This occurs when every record in the table must be search through to find matches. In a large table this would be very slow.

You could add a simple Index the table to speed the query up massively

CREATE INDEX IX_Surname ON People (Surname)

When this index has been created, the query optimiser will look-up the values in the index, and only then get the data from the table for the rows that match, a much faster operation.

If you want to find out the Indexes that SQL Server believes you need, then you can use the dynamic maangement view sys.dm_db_missing_index_details

Don’t rely on this view to be comprehensive, just use it as a guideline to begin investigation. Sometimes it’s a bit enthusiastic, whilst the indexes may be useful, the performance trade-off may not be ideal.

This is only a very basic introduction to indexes, I’ll be going over them in more detail in further posts. If there’s anything you want to see covered sooner rather than later, let me know and I’ll see what I can do.

Advertisements

Sql Standard Joins

When you need to get sensible data out of a well designed database, you will often need to use a JOIN condition.

JOINs will join together two (or more) tables, based on the columns that you select. The two most common types of join are the INNER JOIN and LEFT OUTER JOIN.

An inner join will return rows from the database that match on both tables referenced in the join. A left outer join will return all rows from the first referenced table (the left), and include matching data from the second table.

As a simple example, imagine we had a table containing the details of some Invoices, and a table with details of the Line Items on each invoice.

In a realistic schema we’d have further tables for Customers, Items and so on, this is kept deliberately simple for illustration purposes.


InvoiceID   InvoiceDate   Customer

1           2013-05-01    John Smith

2           2013-05-03    Mike Philpott


LineItemID   InvoiceID   Item       ItemPrice   Quantity

1            1           No6 Bolt   0.54        20

2            1           No7 Screw  0.31        10

With these two very simple tables, we can see the syntax for both types of common join, and also what records would be returned.


SELECT
    InvoiceDate, Customer, Item, ItemPrice, Quantity
FROM
    Invoices
INNER JOIN
    LineItems
ON
    Invoices.InvoiceID = LineItems.InvoiceID

This will return only the two rows for John Smith’s order, Mike Philpott’s will be excluded, as it has no line items


SELECT
    InvoiceDate, Customer, Item, ItemPrice, Quantity
FROM
    Invoices
LEFT OUTER JOIN
    LineItems
ON
    Invoices.InvoiceID = LineItems.InvoiceID

In this case, we will see three rows, two for John Smith, with full items details. The third will show Mike Philpott’s order. It will contain values for InvoiceDate and Customer, but will have NULL in Item, ItemPrice and Quantity, as there are no matching rows in the table that is used.

We can expand the joins to consider further tables, by using additional blocks of JOIN and ON. Be careful that when you do this your joins are ordered correctly, so that the appropriate tables are considered at the right time. If the order is incorrect, an INNER JOIN may be applied to the results of a LEFT OUTER JOIN, when you intended the INNER JOIN to be applied first.

In summary, we should use INNER JOIN when we only want to know about the rows that have matches in the linked table, and the LEFT OUTER JOIN when we want to retain all rows in the first table.