New Metadata-Only Column Changes in SQL Server 2016
SQLPerformance.com, Fri, 17 Apr 2020
https://sqlperformance.com/2020/04/database-design/new-metadata-column-changes-sql-server-2016
Paul White investigates the new metadata-only column data type changes possible in SQL Server 2016 when using data compression.

The ALTER TABLE ... ALTER COLUMN command is very powerful. You can use it to change a column’s data type, length, precision, scale, nullability, collation…and many other things besides.

It is certainly more convenient than the alternative: Creating a new table and migrating the data each time a change is necessary. Nevertheless, there is only so much that can be done to hide the underlying complexity. Along with a large number of restrictions on what is even possible with this command, there is always the question of performance.

Ultimately, tables are stored as a sequence of bytes with some metadata elsewhere in the system to describe what each of those bytes mean, and how they relate to each of the table’s various columns. When we ask SQL Server to change some aspect of a column’s definition, it needs to check that the existing data is compatible with the new definition. It also needs to determine if the current physical layout needs to change.

Depending on the type of change and the configuration of the database, an ALTER COLUMN command will need to perform one of the following actions:

  1. Change metadata in system tables only.
  2. Check all the existing data for compatibility, then change metadata.
  3. Rewrite some or all of the stored data to match the new definition.

Option 1 represents the ideal case from a performance point of view. It requires only a few changes to system tables, and a minimal amount of logging. The operation will still require a restrictive schema modification Sch-M lock, but the metadata changes themselves will complete very quickly, regardless of the size of the table.

Metadata-Only Changes

There are a number of special cases to watch out for, but as a general summary, the following actions only require changes to metadata:

  • Going from NOT NULL to NULL for the same data type.
  • Increasing the maximum size of a varchar, nvarchar, or varbinary column (except to max).
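
For illustration, a widening of this kind might look like the following sketch (the table and column names here are hypothetical):

```sql
-- Hypothetical table: widening a variable-length column (not to max)
-- requires only a metadata change, on any version of SQL Server.
ALTER TABLE dbo.Orders
ALTER COLUMN order_note varchar(500) NOT NULL; -- previously varchar(100) NOT NULL
```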

Improvements in SQL Server 2016

The subject of this post is the additional changes that are enabled for metadata-only from SQL Server 2016 onward. No changes to syntax are needed, and no configuration settings need to be modified. You get these undocumented improvements for free.

The new capabilities target a subset of the fixed-length data types, and apply to row-store tables in the following circumstances:

  • Compression must be enabled:
    • On all indexes and partitions, including the base heap or clustered index.
    • Either ROW or PAGE compression.
    • Indexes and partitions may use a mixture of these compression types. The important thing is that there are no uncompressed indexes or partitions.
  • Changing from NULL to NOT NULL is not allowed.
  • The following integer type changes are supported:
    • smallint to integer or bigint.
    • integer to bigint.
    • smallmoney to money (uses integer representation internally).
  • The following string and binary type changes are supported:
    • char(n) to char(m) or varchar(m)
    • nchar(n) to nchar(m) or nvarchar(m)
    • binary(n) to binary(m) or varbinary(m)
    • All of the above only for n < m and m != max
    • Collation changes are not allowed

These changes can be metadata-only because the underlying binary data layout does not change when Column Descriptor row format is used (hence the need for compression). Without compression, row store uses the original FixedVar representation, which cannot accommodate these fixed-length data type changes without rewriting the physical layout.

You may notice that tinyint is omitted from the integer types list. This is because it is unsigned, while the other integer types are all signed, so a metadata-only change is not possible. For example, a value of 255 can fit in one byte for tinyint, but requires two bytes in any of the signed formats. The signed formats can hold -128 to +127 in one byte when compressed.

Integer Example

One very handy application of this improvement is changing the data type of a column with the IDENTITY property.

Say we have the following heap table using row compression (page compression would also work):

DROP TABLE IF EXISTS dbo.Test;
GO
CREATE TABLE dbo.Test
(
    id integer IDENTITY NOT NULL,
    some_value integer NOT NULL
)
WITH (DATA_COMPRESSION = ROW);

Let’s add 5 million rows of data. This will be enough to make it obvious (from a performance standpoint) whether changing the column data type is a metadata-only operation or not:

WITH Numbers AS
(
    SELECT 
        n = ROW_NUMBER() OVER (ORDER BY @@SPID) 
    FROM sys.all_columns AS AC1
    CROSS JOIN sys.all_columns AS AC2
    ORDER BY n
    OFFSET 0 ROWS
    FETCH FIRST 5 * 1000 * 1000 ROWS ONLY
)
INSERT dbo.Test
    WITH (TABLOCKX)
(
    some_value
)
SELECT
    N.n
FROM Numbers AS N;

Next we will reseed the IDENTITY to make it seem like we are almost at the point of running out of values that will fit in an integer:

DBCC CHECKIDENT
(
    N'dbo.Test',
    RESEED,
    2147483646
);

We can add one more row successfully:

INSERT dbo.Test
    (some_value)
VALUES
    (123456);

But attempting to add another row:

INSERT dbo.Test
    (some_value)
VALUES
    (7890);

Results in an error message:

Msg 8115, Level 16, State 1, Line 1
Arithmetic overflow error converting IDENTITY to data type int.

We can fix that by converting the column to bigint:

ALTER TABLE dbo.Test
ALTER COLUMN id bigint NOT NULL;

Thanks to the improvements in SQL Server 2016, this command changes metadata only, and completes immediately. The previous INSERT statement (the one that threw the arithmetic overflow error) now completes successfully.
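
One informal way to see this is to time the statement. A sketch (exact timings will vary, but a metadata-only change completes in milliseconds regardless of row count):

```sql
SET STATISTICS TIME ON;

-- Metadata-only on SQL Server 2016+ when the table is fully compressed.
ALTER TABLE dbo.Test
ALTER COLUMN id bigint NOT NULL;

SET STATISTICS TIME OFF;
```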

This new ability does not solve all the issues around changing the type of a column with the IDENTITY property. We will still need to drop and recreate any indexes on the column, recreate any referencing foreign keys, and so on. That is a bit outside the scope of this post (though Aaron Bertrand has written about it before). Being able to change the type as a metadata-only operation certainly doesn’t hurt. With careful planning, the other steps required can be made as efficient as possible, for example using minimally-logged or ONLINE operations.

Be Careful with Syntax

Be sure to always specify NULL or NOT NULL when changing data types with ALTER COLUMN. Say for example we wanted to also change the data type of the some_value column in our test table from integer NOT NULL to bigint NOT NULL.

When we write the command, we omit the NULL or NOT NULL qualifier:

ALTER TABLE dbo.Test
ALTER COLUMN some_value bigint;

This command completes successfully as a metadata-only change, but also removes the NOT NULL constraint. The column is now bigint NULL, which is not what we intended. This behaviour is documented, but it is easy to overlook.

We might try to fix our mistake with:

ALTER TABLE dbo.Test
ALTER COLUMN some_value bigint NOT NULL;

This is not a metadata-only change. We are not allowed to change from NULL to NOT NULL (refer back to the earlier table if you need a refresher on the conditions). SQL Server will need to check all the existing values to ensure no nulls are present. It will then physically rewrite every row of the table. As well as being slow in itself, these actions generate a great deal of transaction log, which can have knock-on effects.

As a side note, this same mistake is not possible for columns with the IDENTITY property. If we write an ALTER COLUMN statement without NULL or NOT NULL in that case, the engine helpfully assumes we meant NOT NULL because the identity property is not allowed on nullable columns. It is still a great idea not to rely on this behaviour.

Always specify NULL or NOT NULL with ALTER COLUMN.

Collation

Particular care is needed when altering a string column that has a collation not matching the default for the database.

For example, say we have a table with a case- and accent-sensitive collation (assume the database default is different):

DROP TABLE IF EXISTS dbo.Test2;
GO
CREATE TABLE dbo.Test2
(
    id integer IDENTITY NOT NULL,
    some_string char(8) COLLATE Latin1_General_100_CS_AS NOT NULL
)
WITH (DATA_COMPRESSION = ROW);

Add 5 million rows of data:

WITH Numbers AS
(
    SELECT 
        n = ROW_NUMBER() OVER (ORDER BY @@SPID) 
    FROM sys.all_columns AS AC1
    CROSS JOIN sys.all_columns AS AC2
    ORDER BY n
    OFFSET 0 ROWS
    FETCH FIRST 5 * 1000 * 1000 ROWS ONLY
)
INSERT dbo.Test2
    WITH (TABLOCKX)
(
    some_string
)
SELECT
    CONVERT(char(8), N.n) COLLATE Latin1_General_100_CS_AS
FROM Numbers AS N;

Double the length of the string column using the following command:

ALTER TABLE dbo.Test2
ALTER COLUMN some_string char(16) NOT NULL;

We remembered to specify NOT NULL, but forgot about the non-default collation. SQL Server assumes we meant to change collation to the database default (Latin1_General_CI_AS for my test database). Changing collation prevents the operation from being metadata-only, and so the operation runs for several minutes, generating heaps of log.

Recreate the table and data using the previous script, then try the ALTER COLUMN command again, but specifying the existing non-default collation as part of the command:

ALTER TABLE dbo.Test2
ALTER COLUMN some_string 
    char(16) COLLATE Latin1_General_100_CS_AS NOT NULL;

The change now completes immediately, as a metadata-only operation. As with the NULL and NOT NULL syntax, it pays to be explicit to avoid accidents. This is good advice in general, not just for ALTER COLUMN.
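
To avoid this mistake in the first place, the column's current collation can be read from the catalog and restated verbatim in the ALTER. A sketch:

```sql
-- Look up the column's current collation so it can be repeated
-- exactly in the ALTER COLUMN statement.
SELECT c.name, c.collation_name
FROM sys.columns AS c
WHERE c.[object_id] = OBJECT_ID(N'dbo.Test2', N'U')
    AND c.name = N'some_string';
```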

Compression

Please be aware that compression needs to be explicitly specified for each index, and separately for the base table if it is a heap. This is another example where using abbreviated syntax or shortcuts can prevent the desired outcome.

For example, the following table does not specify explicit compression for either the primary key or in-line index definition:

CREATE TABLE dbo.Test
(
    id integer IDENTITY NOT NULL PRIMARY KEY,
    some_value integer NOT NULL
        INDEX [IX dbo.Test some_value]
)
WITH (DATA_COMPRESSION = PAGE);

The PRIMARY KEY will have a system-assigned name, default to CLUSTERED, and be PAGE compressed. The in-line index will be NONCLUSTERED and not compressed at all. This table will not be enabled for any of the new optimizations, because not all indexes and partitions are compressed.

A much better, and more explicit table definition would be:

CREATE TABLE dbo.Test
(
    id integer IDENTITY NOT NULL
        CONSTRAINT [PK dbo.Test id]
        PRIMARY KEY CLUSTERED
        WITH (DATA_COMPRESSION = PAGE),
    some_value integer NOT NULL
        INDEX [IX dbo.Test some_value]
        NONCLUSTERED
        WITH (DATA_COMPRESSION = ROW)        
);

This table will qualify for the new optimizations because all indexes and partitions are compressed. As noted previously, mixing compression types is fine.

There are a variety of ways to write this CREATE TABLE statement in an explicit way, so there is an element of personal preference. The important takeaway point is to always be explicit about what you want. This applies to separate CREATE INDEX statements as well.
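
One way to verify that no uncompressed index or partition remains is to query the catalog. A sketch:

```sql
-- Any row with data_compression_desc = 'NONE' disqualifies the table
-- from the new metadata-only optimizations.
SELECT
    i.name AS index_name,
    p.partition_number,
    p.data_compression_desc
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.[object_id] = p.[object_id]
    AND i.index_id = p.index_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.Test', N'U');
```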

Extended Events and Trace Flag

There is an extended event specifically for the new metadata-only ALTER COLUMN operations supported in SQL Server 2016 onward.

The extended event is compressed_alter_column_is_md_only in the Debug channel. Its event fields are object_id, column_id, and is_md_only (true/false).

This event only indicates if an operation is metadata-only due to the new abilities of SQL Server 2016. Column alterations that were metadata-only before 2016 will show is_md_only = false despite still being metadata-only.

Other extended events useful for tracking ALTER COLUMN operations include metadata_ddl_alter_column and alter_column_event, both in the Analytic channel.
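
A minimal event session capturing the new event might be sketched as follows (the event and field names are as described above; the choice of target is a matter of preference):

```sql
CREATE EVENT SESSION [md_only_alter] ON SERVER
ADD EVENT sqlserver.compressed_alter_column_is_md_only
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [md_only_alter] ON SERVER STATE = START;
```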

Should you need to disable the new SQL Server 2016 capabilities for any reason, undocumented global (or start-up) trace flag 3618 can be used. This trace flag is not effective when used at the session level. There is no way to specify a query-level trace flag with an ALTER COLUMN command.

Final Thoughts

Being able to change some fixed-length integer data types with a metadata-only change is a very welcome product improvement. It does require that the table is already fully compressed, but that is increasingly common anyway, especially since compression became available in all editions starting with SQL Server 2016 Service Pack 1.

Fixed-length string type columns are probably much less common. Some of this may be due to somewhat out-of-date considerations like space usage. When compressed, fixed-length string columns do not store trailing blanks, making them just as efficient as variable-length string columns from a storage point of view. It can be annoying to trim spaces for manipulation or display, but if the data usually occupies most of the maximum length, fixed-length types can have important advantages, not least regarding memory grants for things like sorting and hashing.

It’s not all good news with compression enabled. I mentioned earlier that SQL Server can sometimes perform a metadata-only change after checking that all existing values will convert successfully to the new type. This is the case when using ALTER COLUMN to change from integer to smallint for example. Unfortunately, these operations are not currently metadata-only for compressed objects.

Acknowledgements

Special thanks to Panagiotis Antonopoulos (Principal Software Engineer) and Mirek Sztajno (Senior Program Manager) from the SQL Server product team for their assistance and guidance during the research and writing of this article.

None of the details given in this work should be regarded as official Microsoft documentation or product statements.

Reasons to Upgrade to SQL Server 2017
SQLPerformance.com, Mon, 23 Apr 2018
https://sqlperformance.com/2018/04/sql-server-2017/reasons-to-upgrade-to-sql-server-2017
Glenn Berry discusses a number of compelling reasons why, if you are choosing between SQL Server 2016 and SQL Server 2017, you should choose the newer version.

SQL Server 2012 fell out of Mainstream Support on July 11, 2017, and SQL Server 2014 will fall out of Mainstream Support on July 9, 2019. Both SQL Server 2008 and SQL Server 2008 R2 will fall out of Extended Support on July 9, 2019. As a result, I have had an increasing number of clients who are planning to upgrade from these legacy versions of SQL Server to a more modern version (either 2016 or 2017).

If you are considering an upgrade to a modern version of SQL Server, should you move to SQL Server 2016 or to SQL Server 2017? My general guidance would be to strongly prefer SQL Server 2017 in most cases, unless you have a valid, tangible reason not to. For example, if you have 3rd party software that is explicitly not supported on SQL Server 2017, that could be a blocker (although I would try talking to your software vendor about that).

So why should you prefer SQL Server 2017? Here are a number of reasons in several different categories.

New Engine Features in SQL Server 2017

Even though much of the development work in SQL Server 2017 was dedicated to making a version that runs on several different Linux distributions, there are still a number of useful new Engine features in SQL Server 2017.

One useful new feature is Automatic Plan Correction (which relies on the Query Store feature that was introduced in SQL Server 2016). This feature helps detect and automatically correct many query plan stability issues. My colleague, Erin Stellato, has written about Automatic Plan Correction and how it works in SQL Server 2017 Enterprise Edition and Azure SQL Database in much more detail.
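
Automatic Plan Correction is enabled per database, and its recommendations can also be reviewed manually. A sketch using the documented option and DMV:

```sql
-- Requires Query Store to be enabled for the database.
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Inspect the plan-regression recommendations being generated.
SELECT reason, score, details
FROM sys.dm_db_tuning_recommendations;
```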

Another new feature is Adaptive Query Processing (AQP) (which is enabled with compatibility mode 140), which is very helpful for batch mode operations used with Columnstore indexes. AQP has three components, which include Batch Mode Adaptive Memory Grant Feedback, Batch Mode Adaptive Joins, and Interleaved Execution for Multi-Statement Table Valued Functions.

There are also a number of diagnostic and troubleshooting improvements that make the life of a DBA much easier! These include multiple Showplan enhancements that are very useful for query tuning, along with several new DMVs that are useful for diagnostic troubleshooting. Query Store is even more efficient in SQL Server 2017 compared to SQL Server 2016.

There were also several “community-driven” enhancements in SQL Server 2017, such as smart differential backup, smart transaction log backup, improved backup performance for small databases on high-end servers, and improved tempdb diagnostics and monitoring.

Replication Enhancements

Microsoft has made it possible to protect a remote distribution database in an Availability Group with SQL Server 2017 CU6. They have also added the ability to dynamically reload transaction Agent profile parameters in SQL Server 2017 CU3. This makes it much easier to tune and maintain your replication topology than it was in the past.

Microsoft Support

Once a major version of SQL Server falls out of Mainstream Support, there are no more Service Packs or Cumulative Updates. Only security updates are released while the major version is in Extended Support. After Extended Support ends, there are no more updates at all unless you purchase Microsoft Premium Assurance (which is only available for SQL Server 2008 and newer).

Recent evidence of this policy is the fact that Microsoft will not be releasing a Spectre/Meltdown security patch for SQL Server 2005. Once the General Data Protection Regulation (GDPR) goes into effect on May 25, 2018, older versions of SQL Server will be more vulnerable over time as they fall out of Microsoft Support.

SQL Server 2017 will be fully supported for about 15 months longer than SQL Server 2016. Microsoft has a page with many resources about the end of support for SQL Server 2008 and 2008 R2.

The Mainstream and Extended Support end dates for all recent versions of SQL Server are shown in Table 1.

Table 1: Mainstream and Extended Support end dates

SQL Server Version    End of Mainstream Support    End of Extended Support
SQL Server 2005       April 12, 2011               April 12, 2016
SQL Server 2008       July 8, 2014                 July 9, 2019
SQL Server 2008 R2    July 8, 2014                 July 9, 2019
SQL Server 2012       July 11, 2017                July 12, 2022
SQL Server 2014       July 9, 2019                 July 9, 2024
SQL Server 2016       July 13, 2021                July 14, 2026
SQL Server 2017       October 11, 2022             October 12, 2027

Microsoft Servicing Policy

Microsoft SQL Server 2017 was released on October 2, 2017, and so far, there have been six SQL Server 2017 Cumulative Updates that have been released. Microsoft has adopted what they call the “Modern Servicing Model” for SQL Server 2017, which means that there will be no Service Packs for SQL Server 2017 or future versions.

Instead, there will be monthly Cumulative Updates for the first year, and then quarterly Cumulative Updates for the next four years after that. This means that defects will be fixed much more quickly during the first year after release. SQL Server 2017 has been released long enough that I consider it quite stable.

Not only does Microsoft correct specific defects in Cumulative Updates, they also release new functionality and other product improvements, which are quite often focused on improving performance. Here is the list of the updates and improvements for SQL Server 2017, as of CU6.

  • SQL Server 2017 CU6
  • SQL Server 2017 CU5
  • SQL Server 2017 CU4
  • SQL Server 2017 CU3
  • SQL Server 2017 CU2
  • SQL Server 2017 CU1

Even though you might think SQL Server 2017 was “just a Linux port,” there are actually many real-world improvements in the core Database Engine that will benefit all platforms. Microsoft continues to add extra functionality to the product in SQL Server 2017 Cumulative Updates, and SQL Server 2017 will be fully supported by Microsoft for a longer period of time than SQL Server 2016.

Properly Persisted Computed Columns
SQLPerformance.com, Thu, 25 May 2017
https://sqlperformance.com/2017/05/sql-plan/properly-persisted-computed-columns
Paul White (@SQL_Kiwi) walks through a new trace flag in SQL Server 2016 designed to yield better execution plans (and performance) for queries involving computed columns.

Pretty much every computed column related performance problem I have encountered over the years has had one (or more) of the following root causes:

  • Implementation limitations
  • Lack of cost model support in the query optimizer
  • Computed column definition expansion before optimization starts

An example of an implementation limitation is not being able to create a filtered index on a computed column (even when persisted). There is not much we can do about this problem category; we have to use workarounds while we wait for product improvements to arrive.

The lack of optimizer cost model support means SQL Server assigns a small fixed cost to scalar computations, regardless of complexity or implementation. As a consequence, the server often decides to recompute a stored computed column value instead of reading the persisted or indexed value directly. This is particularly painful when the computed expression is expensive, for example when it involves calling a scalar user-defined function.

The problems around definition expansion are a bit more involved, and have wide-ranging effects.

The Problems of Computed Column Expansion

SQL Server normally expands computed columns into their underlying definitions during the binding phase of query normalization. This is a very early phase in the query compilation process, well before any plan selection decisions are made (including trivial plan).

In theory, performing early expansion might enable optimizations that would otherwise be missed. For example, the optimizer might be able to apply simplifications given other information in the query and metadata (e.g. constraints). This is the same sort of reasoning that leads to view definitions being expanded (unless a NOEXPAND hint is used).

Later in the compilation process (but still before even a trivial plan has been considered), the optimizer looks to match back expressions to persisted or indexed computed columns. The problem is that optimizer activities in the meantime may have changed the expanded expressions such that matching back is no longer possible.

When this occurs, the final execution plan looks as if the optimizer has missed an "obvious" opportunity to use a persisted or indexed computed column. There are few details in execution plans that can help determine the cause, making this a potentially frustrating issue to debug and fix.

Matching Expressions to Computed Columns

It is worth being especially clear that there are two separate processes here:

  • Early expansion of computed columns; and
  • Later attempts at matching expressions to computed columns.

In particular, note that any query expression may be matched to a suitable computed column later on, not just expressions that arose from expanding computed columns.

Computed column expression matching can enable plan improvements even when the text of the original query cannot be modified. For example, creating a computed column to match a known query expression allows the optimizer to use statistics and indexes associated with the computed column. This feature is conceptually similar to indexed view matching in Enterprise Edition. Computed column matching is functional in all editions.
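
As a sketch of that pattern (table, column, and index names here are illustrative):

```sql
-- Add a computed column matching an expression the workload already uses,
-- then index it. The query text itself does not change.
ALTER TABLE dbo.Users
    ADD EmailLower AS LOWER(Email);

CREATE INDEX IX_Users_EmailLower
    ON dbo.Users (EmailLower);

-- Existing query, unmodified; the optimizer can match LOWER(Email)
-- to the computed column and use the index and its statistics.
SELECT UserID
FROM dbo.Users
WHERE LOWER(Email) = N'someone@example.com';
```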

From a practical point of view, my own experience has been that matching general query expressions to computed columns can indeed benefit performance, efficiency, and execution plan stability. On the other hand, I have rarely (if ever) found computed column expansion to be worthwhile. It just never seems to yield any useful optimizations.

Computed Column Uses

Computed columns that are neither persisted nor indexed have valid uses. For example, they can support automatic statistics if the column is deterministic and precise (no floating point elements). They can also be used to save storage space (at the expense of a little extra runtime processor use). As a final example, they can provide a neat way to ensure that a simple calculation is always performed correctly, rather than being explicitly written out in queries each time.

Persisted computed columns were added to the product specifically to allow indexes to be built on deterministic but "imprecise" (floating point) columns. In my experience, this intended usage is relatively rare. Maybe this is simply because I do not encounter floating point data very much.

Floating point indexes aside, persisted columns are pretty common. To some extent, this may be because inexperienced users assume that a computed column must always be persisted before it can be indexed. More experienced users may employ persisted columns simply because they have found that performance tends to be better that way.

Indexed computed columns (persisted or not) can be used to provide ordering and an efficient access method. It can be useful to store a computed value in an index without also persisting it in the base table. Equally, suitable computed columns may also be included in indexes rather than being key columns.

Poor Performance

A major cause of poor performance is a simple failure to use an indexed or persisted computed column value as expected. I have lost count of the number of questions I have had over the years asking why the optimizer would choose a terrible execution plan when an obviously better plan using an indexed or persisted computed column exists.

The precise cause in each case varies, but is almost always either a faulty cost-based decision (because scalars are assigned a low fixed cost); or a failure to match an expanded expression back to a persisted computed column or index.

The match-back failures are especially interesting to me, because they often involve complex interactions with orthogonal engine features. Equally often, the failure to "match back" leaves an expression (rather than a column) in a position in the internal query tree that prevents an important optimization rule from matching. In either case, the outcome is the same: a sub-optimal execution plan.

Now, I think it is fair to say that people generally index or persist a computed column with the strong expectation that the stored value will actually be used. It can come as quite a shock to see SQL Server recomputing the underlying expression each time, while ignoring the deliberately-provided stored value. People are not always very interested in the internal interactions and cost model deficiencies that led to the undesirable outcome. Even where workarounds exist, these require time, skill, and effort to discover and test.

In short: many people would simply prefer SQL Server to use the persisted or indexed value. Always.

A New Option

Historically, there has been no way to force SQL Server to always use the stored value (no equivalent to the NOEXPAND hint for views). There are some circumstances in which a plan guide will work, but it is not always possible to generate the required plan shape in the first place, and not all plan elements and positions can be forced (filters and compute scalars, for example).

There is still no neat, fully documented solution, but a recent update to SQL Server 2016 has provided an interesting new approach. It applies to SQL Server 2016 instances patched with at least Cumulative Update 2 for SQL Server 2016 SP1 or Cumulative Update 4 for SQL Server 2016 RTM.

The relevant update is documented in: FIX: Unable to rebuild the partition online for a table that contains a computed partitioning column in SQL Server 2016

As so often with support documentation, this does not say exactly what has been changed in the engine to address the issue. It certainly does not look terribly relevant to our current concerns, judging by the title and description. Nevertheless, this fix introduces a new supported trace flag 176, which is checked in a code method called FDontExpandPersistedCC. As the method name suggests, this prevents a persisted computed column from being expanded.

There are three important caveats to this:

  1. The computed column must be persisted. Even if indexed, the column must also be persisted.
  2. Match back from general query expressions to persisted computed columns is disabled.
  3. The documentation does not describe the function of the trace flag, and does not prescribe it for any other use. If you choose to use trace flag 176 to prevent expansion of persisted computed columns, it will therefore be at your own risk.

This trace flag is effective as a start-up -T option, at both global and session scope using DBCC TRACEON, and per query with OPTION (QUERYTRACEON).

Example

This is a simplified version of a question (based on a real-world problem) that I answered on Database Administrators Stack Exchange a few years ago. The table definition includes a persisted computed column:

CREATE TABLE dbo.T
(   
    ID integer IDENTITY NOT NULL,
    A varchar(20) NOT NULL,
    B varchar(20) NOT NULL,
    C varchar(20) NOT NULL,
    D date NULL,
    Computed AS A + '-' + B + '-' + C PERSISTED,

    CONSTRAINT PK_T_ID 
        PRIMARY KEY CLUSTERED (ID)
);
GO
INSERT dbo.T WITH (TABLOCKX)
    (A, B, C, D)
SELECT
    A = STR(SV.number % 10, 2),
    B = STR(SV.number % 20, 2),
    C = STR(SV.number % 30, 2),
    D = DATEADD(DAY, 0 - SV.number, SYSUTCDATETIME())
FROM master.dbo.spt_values AS SV
WHERE SV.[type] = N'P';

The query below returns all rows from the table in a particular order, while also returning the next value of column D in the same order:

SELECT
    T1.ID,
    T1.Computed,
    T1.D,
    NextD =
    (
        SELECT TOP (1)
            T2.D
        FROM dbo.T AS T2
        WHERE
            T2.Computed = T1.Computed
            AND T2.D > T1.D
        ORDER BY
            T2.D ASC
    )
FROM dbo.T AS T1
ORDER BY
    T1.Computed, T1.D;

An obvious covering index to support the final ordering and lookups in the sub-query is:

CREATE UNIQUE NONCLUSTERED INDEX IX_T_Computed_D_ID
ON dbo.T (Computed, D, ID);

The execution plan delivered by the optimizer is surprising and disappointing:

Plan with Clustered Index Scan and Sort

The Index Seek on the inner side of the Nested Loops Join seems to be all good. The Clustered Index Scan and Sort on the outer input, however, is unexpected. We would have hoped to see an ordered scan of our covering nonclustered index instead.

We can force the optimizer to use the nonclustered index with a table hint:

SELECT
    T1.ID,
    T1.Computed,
    T1.D,
    NextD =
    (
        SELECT TOP (1)
            T2.D
        FROM dbo.T AS T2
        WHERE
            T2.Computed = T1.Computed
            AND T2.D > T1.D
        ORDER BY
            T2.D ASC
    )
FROM dbo.T AS T1
    WITH (INDEX(IX_T_Computed_D_ID)) -- New!
ORDER BY
    T1.Computed, T1.D;

The resulting execution plan is:

Plan with Key Lookup

Scanning the nonclustered index removes the Sort, but adds a Key Lookup! The lookups in this new plan are surprising, given that our index definitely covers all columns needed by the query.

Looking at the properties of the Key Lookup operator:

Key Lookup Properties

For some reason, the optimizer has decided that three columns not mentioned in the query need to be fetched from the base table (since they are not present in our nonclustered index by design).

Looking around the execution plan, we discover that the looked-up columns are needed by the inner side Index Seek:

Index Seek Details

The first part of this seek predicate corresponds to the correlation T2.Computed = T1.Computed in the original query. The optimizer has expanded the definitions of both computed columns, but only managed to match back to the persisted and indexed computed column for the inner side alias T2. Leaving the T1 reference expanded has resulted in the outer side of the join needing to provide the base table columns (A, B, and C) needed to compute that expression for each row.

As is sometimes the case, it is possible to rewrite this query so that the problem goes away (one option is shown in my old answer to the Stack Exchange question). Using SQL Server 2016, we can also try trace flag 176 to prevent the computed columns being expanded:

SELECT
    T1.ID,
    T1.Computed,
    T1.D,
    NextD =
    (
        SELECT TOP (1)
            T2.D
        FROM dbo.T AS T2
        WHERE
            T2.Computed = T1.Computed
            AND T2.D > T1.D
        ORDER BY
            T2.D ASC
    )
FROM dbo.T AS T1
ORDER BY
    T1.Computed, T1.D
OPTION (QUERYTRACEON 176); -- New!

The execution plan is now much improved:

Plan with TF 176

This execution plan contains only references to the computed columns. The Compute Scalars do nothing useful and would be cleaned up if the optimizer were a bit tidier around the house.

The important point is that the optimal index is now used correctly, and the Sort and Key Lookup have been eliminated. All by preventing SQL Server from doing something we would never have expected it to do in the first place (expanding a persisted and indexed computed column).

Using LEAD

The original Stack Exchange question was targeted at SQL Server 2008, where LEAD is not available. Let us try expressing the requirement on SQL Server 2016 using the newer syntax:

SELECT 
    T1.ID, 
    T1.Computed, 
    T1.D, 
    NextD =
        LEAD(T1.D) OVER (
            PARTITION BY T1.Computed 
            ORDER BY T1.D)
FROM dbo.T AS T1
ORDER BY 
    T1.Computed;

The SQL Server 2016 execution plan is:

LEAD execution plan

This plan shape is quite typical for a simple row mode window function. The one unexpected item is the Sort operator in the middle. If the data set were large, this Sort could have a big impact on performance and memory usage.

The issue, once again, is computed column expansion. In this case, one of the expanded expressions sits in a position that prevents normal optimizer logic simplifying the Sort away.

Trying exactly the same query with trace flag 176:

SELECT 
    T1.ID, 
    T1.Computed, 
    T1.D, 
    NextD =
        LEAD(T1.D) OVER (
            PARTITION BY T1.Computed 
            ORDER BY T1.D)
FROM dbo.T AS T1
ORDER BY 
    T1.Computed
OPTION (QUERYTRACEON 176);

Produces the plan:

LEAD plan with TF 176

The Sort has disappeared as it should. Note also in passing that this query qualified for a trivial plan, avoiding cost-based optimization altogether.
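As a quick way to verify this (a sketch), request the plan as XML and check the StatementOptmLevel attribute, which reads TRIVIAL when cost-based optimization was skipped:

SET SHOWPLAN_XML ON;
GO
-- The plan XML returned for this statement carries
-- StatementOptmLevel="TRIVIAL" when the trivial plan is used
SELECT 
    T1.ID, 
    T1.Computed, 
    T1.D, 
    NextD =
        LEAD(T1.D) OVER (
            PARTITION BY T1.Computed 
            ORDER BY T1.D)
FROM dbo.T AS T1
ORDER BY 
    T1.Computed
OPTION (QUERYTRACEON 176);
GO
SET SHOWPLAN_XML OFF;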

Disabled General Expression Matching

One of the caveats mentioned earlier was that trace flag 176 also disables matching from expressions in the source query to persisted computed columns.

To illustrate, consider the following version of the example query. The LEAD computation has been removed, and the references to the computed column in the SELECT and ORDER BY clauses have been replaced with the underlying expressions. Run it first without trace flag 176:

SELECT 
    T1.ID, 
    Computed = T1.A + '-' + T1.B + '-' + T1.C,
    T1.D
FROM dbo.T AS T1
ORDER BY 
    T1.A + '-' + T1.B + '-' + T1.C;

The expressions are matched to the persisted computed column, and the execution plan is a simple ordered scan of the nonclustered index:

Simple query plan

The Compute Scalar there is once again just leftover architectural junk.

Now try the same query with trace flag 176 enabled:

SELECT 
    T1.ID, 
    Computed = T1.A + '-' + T1.B + '-' + T1.C,
    T1.D
FROM dbo.T AS T1
ORDER BY 
    T1.A + '-' + T1.B + '-' + T1.C
OPTION (QUERYTRACEON 176); -- New!

The new execution plan is:

Simple query plan with TF 176

The Nonclustered Index Scan has been replaced with a Clustered Index Scan. The Compute Scalar evaluates the expression, and the Sort orders by the result. Deprived of the ability to match expressions to persisted computed columns, the optimizer cannot make use of the persisted value, or the nonclustered index.

Note that the expression matching limitation only applies to persisted computed columns when trace flag 176 is active. If we make the computed column indexed but not persisted, expression matching works correctly.

In order to drop the persisted attribute, we need to drop the nonclustered index first. Once the change is made we can put the index straight back (because the expression is deterministic and precise):

DROP INDEX IX_T_Computed_D_ID ON dbo.T;
GO
ALTER TABLE dbo.T
ALTER COLUMN Computed
DROP PERSISTED;
GO
CREATE UNIQUE NONCLUSTERED INDEX IX_T_Computed_D_ID
ON dbo.T (Computed, D, ID);

The optimizer now has no problems matching the query expression to the computed column when trace flag 176 is active:

-- Computed column no longer persisted
-- but still indexed. TF 176 active.
SELECT 
    T1.ID, 
    Computed = T1.A + '-' + T1.B + '-' + T1.C,
    T1.D
FROM dbo.T AS T1
ORDER BY 
    T1.A + '-' + T1.B + '-' + T1.C
OPTION (QUERYTRACEON 176);

The execution plan returns to the optimal nonclustered index scan without a sort:

Computed column not persisted but indexed

To summarize: Trace flag 176 prevents persisted computed column expansion. As a side-effect, it also prevents query expression matching to persisted computed columns only.

Schema metadata is only loaded once, during the binding phase. Trace flag 176 prevents expansion so the computed column definition is not loaded at that time. Later expression-to-column matching cannot work without the computed column definition to match against.

The initial metadata load brings in all columns, not just those referenced in the query (that optimization is performed later). This makes all computed columns available for matching, which is generally a good thing. Unfortunately, if one of the loaded computed columns contains a scalar user-defined function, its presence disables parallelism for the whole query even when the problematic column is not used. Trace flag 176 can help with this as well, if the column in question is persisted. By not loading the definition, a scalar user-defined function is never present, so parallelism is not disabled.
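To illustrate that last point with a hypothetical sketch (the function and column names here are invented, not part of the original example):

-- A deterministic scalar UDF; SCHEMABINDING is required for use
-- in a persisted computed column
CREATE FUNCTION dbo.FormatCode (@a varchar(20), @b varchar(20))
RETURNS varchar(50)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @a + '-' + @b;
END;
GO
-- Merely adding this column disables parallelism for every query
-- against dbo.T, referenced or not. With trace flag 176 active, the
-- persisted definition is not loaded, so parallelism remains available
-- for queries that do not use the column.
ALTER TABLE dbo.T
    ADD CodeUDF AS dbo.FormatCode(A, B) PERSISTED;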

Final Thoughts

It seems to me that the SQL Server world would be a better place if the optimizer treated persisted or indexed computed columns more like regular columns. In almost all instances, this would better match developer expectations than the current arrangement. Expanding computed columns into their underlying expressions and later attempting to match them back is not as successful in practice as theory might suggest.

Until SQL Server provides specific support to prevent persisted or indexed computed column expansion, new trace flag 176 is a tempting option for SQL Server 2016 users, albeit an imperfect one. It is a little unfortunate that it disables general expression matching as a side effect. It is also a shame that the computed column has to be persisted when indexed. There is also the risk inherent in using a trace flag for something other than its documented purpose.

It is fair to say that the majority of problems with computed column queries can ultimately be resolved in other ways, given sufficient time, effort, and expertise. On the other hand, trace flag 176 often seems to work like magic. The choice, as they say, is yours.

To finish, here are some interesting computed column problems which benefit from trace flag 176:

The post Properly Persisted Computed Columns appeared first on SQLPerformance.com.

]]>
https://sqlperformance.com/2017/05/sql-plan/properly-persisted-computed-columns/feed 2
SQL Server 2016 Enterprise Edition Performance Advantages https://sqlperformance.com/2016/12/sql-performance/sql-server-2016-enterprise-edition-perf-benefits https://sqlperformance.com/2016/12/sql-performance/sql-server-2016-enterprise-edition-perf-benefits#comments Thu, 22 Dec 2016 14:27:57 +0000 https://sqlperformance.com/?p=8459 Glenn Berry uses a few specific examples to demonstrate the performance benefits you can get from SQL Server 2016 SP1 Enterprise Edition.

The post SQL Server 2016 Enterprise Edition Performance Advantages appeared first on SQLPerformance.com.

]]>
On November 16, 2016, Microsoft announced some very significant changes for SQL Server 2016 Standard Edition, which were implemented in SQL Server 2016 Service Pack 1 (Build 13.0.4001.0). Many very useful programmability-related features that were previously only available in Enterprise Edition will now be available in Standard Edition (and also in Web Edition and even Express Edition).

Once you have a database application using SQL Server 2016 Standard Edition Service Pack 1 (or even a lower edition), you can just perform an edition upgrade to Enterprise Edition to get even more scalability and performance, taking advantage of the higher license limits for sockets, cores, and memory in Enterprise Edition, as detailed here.

You will also get the many other intrinsic performance benefits that are present in Enterprise Edition, along with multiple manageability improvements that make your life so much easier as a DBA.

Columnstore Indexes

If you are using Columnstore indexes, you get the following performance benefits automatically, when you use Enterprise Edition:

  • Aggregate Pushdown: This performance feature often gives a 2X-4X query performance gain by pushing qualifying aggregates to the SCAN node, which reduces the number of rows coming out of that iterator.
  • Index Build/Rebuild: Enterprise Edition can build/rebuild columnstore indexes with multiple processor cores, while Standard Edition only uses one processor core. This has a pretty significant effect on elapsed times for these operations, depending on your hardware.
  • Local Aggregates: Enterprise Edition can use local aggregations to filter the number of rows passing out of a SCAN node, reducing the amount of work that needs to be done by subsequent query nodes. You can confirm this by looking for the “ActualLocallyAggregatedRows” attribute in the XML of the execution plan for the query.
  • Single Instruction Multiple Data (SIMD) Optimizations: This feature uses a set of hardware instructions which are capable of processing an array of data in a single instruction, dramatically speeding up aggregate operations. These hardware instructions are present on all modern processors (that have AVX support), but they are only used by Enterprise Edition.
  • String Predicate Pushdown: This performance feature can improve performance of queries using predicate(s) on string columns by pushing these predicates to the SCAN node. This can greatly reduce the amount of work that needs to be done by subsequent nodes.
  • Degree of Parallelism: Batch mode queries are limited to MAXDOP = 2 on Standard Edition. Enterprise Edition can use all of the cores that are present for the instance. This can be very significant on larger queries on typical, modern server hardware.
  • Memory limits: The Columnstore object pool is limited to 32GB per instance on Standard Edition. Enterprise Edition does not have any memory limitation for the Columnstore object pool.
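As a sketch of how to check for the local aggregation attribute mentioned above (the aggregate query is illustrative, not the exact test query used here):

-- Capture the actual execution plan as XML, then search it for the
-- ActualLocallyAggregatedRows attribute on the Columnstore Index Scan
SET STATISTICS XML ON;
GO
SELECT 
    ProductKey, 
    SUM(SalesAmount) AS TotalSales
FROM dbo.FactOnlineSales
GROUP BY ProductKey;
GO
SET STATISTICS XML OFF;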

In order to test these performance assertions, I ran some fairly simple tests on the Microsoft ContosoRetailDW database on my Intel Core i7-6700K workstation. I have two named instances of SQL Server 2016 SP1 installed, one using Standard Edition, and the other using Developer Edition (which is equivalent to Enterprise Edition).

All instance-level and database-level configurations and properties are identical between the two instances, and the user and tempdb database file locations are in separate directories on the same, separate flash storage device for each instance. The database compatibility level was changed to 130 in both cases, and the underlying Windows configuration and hardware is the same for both instances. The only difference here is the Edition of each instance.

The first test is a simple query (adapted from Niko Neugebauer) that allows SQL Server 2016 to use aggregate pushdown on the FactOnlineSales table. The results are shown in Table 1.

Edition Elapsed Time (ms)
Standard Edition 30
Developer Edition 1
Time Difference 29
% Improvement 96.7%

Table 1: Aggregate Pushdown Comparison

The next test is timing how long it takes to build a clustered columnstore index on the 12.6 million row FactOnlineSales table. The results are shown in Table 2.

Edition Elapsed Time (ms)
Standard Edition 42,197
Developer Edition 14,384
Time Difference 27,813
% Improvement 65.9%

Table 2: Building Clustered Columnstore Index Comparison

The next test is timing how long it takes to rebuild a clustered columnstore index on the same FactOnlineSales table. The results are shown in Table 3.

Edition Elapsed Time (ms)
Standard Edition 33,105
Developer Edition 11,460
Time Difference 21,645
% Improvement 65.4%

Table 3: Rebuilding Clustered Columnstore Index Comparison

The next test is another simple query that allows SQL Server 2016 to use local aggregation on the FactOnlineSales table. The results are shown in Table 4.

Edition Elapsed Time (ms)
Standard Edition 122
Developer Edition 83
Time Difference 39
% Improvement 32.0%

Table 4: Local Aggregation Comparison

The next test is another simple query that allows SQL Server 2016 to use string predicate pushdown on the FactOnlineSales and DimPromotion tables. The results are shown in Table 5.

Edition Elapsed Time (ms)
Standard Edition 2,683
Developer Edition 1,221
Time Difference 1,466
% Improvement 54.6%

Table 5: String Predicate Pushdown Comparison

These are just some simple examples of the built-in performance advantages for Columnstore indexes in SQL Server 2016 Enterprise Edition compared to SQL Server 2016 Standard Edition on the same hardware. If you want to really dive into Columnstore indexes (which can be a very effective feature for some workloads), you should bookmark and read Niko Neugebauer’s long series of posts at columnstore.net.

DBCC CHECKDB Performance

Another manageability performance improvement that is present on SQL Server 2016 Enterprise Edition is DBCC CHECKDB performance. On Standard Edition, DBCC CHECKDB only uses one processor core, while it can use all available cores on Enterprise Edition. This behavior is unchanged from previous versions of SQL Server. SQL Server 2016 does allow you to restrict the number of cores that DBCC CHECKDB can use with a new WITH (MAXDOP = x) option.
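The new option looks like this (a sketch; the database name is a placeholder):

-- Limit DBCC CHECKDB to four cores (Enterprise Edition can use them all;
-- Standard Edition remains single-threaded regardless)
DBCC CHECKDB (N'YourDatabase') WITH MAXDOP = 4;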

Running DBCC CHECKDB with the WITH PHYSICAL_ONLY option on a somewhat larger database (about 38GB) that I have, yielded the results shown in Table 6.

Edition Elapsed Time (ms)
Standard Edition 58,492
Developer Edition 24,897
Time Difference 33,595
% Improvement 57.4%

Table 6: DBCC CHECKDB WITH PHYSICAL_ONLY Comparison

Running a standard DBCC CHECKDB on the same database yielded the results shown in Table 7.

Edition Elapsed Time (ms)
Standard Edition 435,039
Developer Edition 119,767
Time Difference 315,272
% Improvement 72.5%

Table 7: DBCC CHECKDB Comparison

A very important factor in DBCC CHECKDB performance is the sequential read performance from all of the LUNs where your database data file(s) are located. You can easily check this by running a BACKUP DATABASE command to a NUL device (making sure to use the COPY_ONLY and NO_COMPRESSION options). This will show you your effective sequential read performance, as shown in this example from my workstation:

BACKUP DATABASE successfully processed 5048514 pages in 16.115 seconds (2447.502 MB/sec).
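For reference, the command that produces output like the line above is (with a placeholder database name):

-- Backing up to the NUL device measures pure sequential read throughput.
-- COPY_ONLY avoids disturbing the backup chain, and NO_COMPRESSION keeps
-- the test I/O-bound rather than CPU-bound.
BACKUP DATABASE YourDatabase
TO DISK = N'NUL'
WITH COPY_ONLY, NO_COMPRESSION;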

Keep in mind that all of this testing was done on a single, quad-core desktop processor. A multi-socket server with many more total processor cores will show even more of a performance improvement in many of these tests.

The point of all of this is to show a few tangible examples of the performance improvements you can see, simply by upgrading from SQL Server 2016 Standard Edition SP1 to SQL Server 2016 Enterprise Edition SP1, on the same hardware, making no database or application changes. This list is by no means exhaustive, as there are many other benefits as well.

The post SQL Server 2016 Enterprise Edition Performance Advantages appeared first on SQLPerformance.com.

]]>
https://sqlperformance.com/2016/12/sql-performance/sql-server-2016-enterprise-edition-perf-benefits/feed 18
Memory Limits in SQL Server 2016 SP1 https://sqlperformance.com/2016/12/sql-server-2016/memory-limits-sp1 https://sqlperformance.com/2016/12/sql-server-2016/memory-limits-sp1#comments Mon, 12 Dec 2016 14:33:02 +0000 https://sqlperformance.com/?p=8454 Aaron Bertrand (@AaronBertrand) explains a few details about the memory limits in SQL Server 2016 Service Pack 1 that make this upgrade even more compelling.

The post Memory Limits in SQL Server 2016 SP1 appeared first on SQLPerformance.com.

]]>
A few weeks ago, I made a pretty big deal about SQL Server 2016 Service Pack 1. Many features previously reserved for Enterprise Edition were unleashed to lower editions, and I was ecstatic to learn about these changes.

Nonetheless, I'm seeing a few people who are, let's say, a bit less excited than I am.

It's important to keep in mind that the changes here weren't meant to provide complete feature parity across all editions; they were for the specific purpose of creating a more consistent programming surface area. Now customers can use features like In-Memory OLTP, Columnstore, and compression without worrying about the targeted edition(s) – only about how well they will scale. Several security features that didn't really seem to have anything to do with edition are opened up as well. The one I understood the least was Always Encrypted; I couldn't fathom why only Enterprise customers needed to protect things like credit card data. Transparent Data Encryption is still Enterprise-only, on versions earlier than SQL Server 2019, because this isn't really a programmability feature (either it's on or it's not).

So what's really in it for Standard Edition customers?

I think the biggest problem most people have is that max memory in Standard Edition is still limited to 128GB. They look at that and say, "Gee, thanks for all the features, but the memory limit means I can't really use them."

However, the surface area changes bring about performance improvement opportunities, even if that wasn't their original intention (or even if it was – I wasn't in any of those meetings). Let's take a closer look at a small section of the fine print (from the official docs):

Memory limits for Enterprise/Standard in SQL Server 2016 SP1

The astute reader will notice that the buffer pool limit wording has changed, from:

Memory: Maximum memory utilized per instance

To:

Memory: Maximum buffer pool size per instance

This is a better description of what really happens in Standard Edition: a 128GB limit for the buffer pool only, and other memory reservations can be over and above that (think pools like the plan cache). So, in effect, a Standard Edition server could use 128GB of buffer pool, then max server memory could be higher and support more memory used for other reservations. Similarly, Express Edition is now properly documented to use 1.4GB for the buffer pool.
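In practice, this means max server memory can be set above 128GB on a Standard Edition instance; the buffer pool stops at its cap, and the remainder is available for other reservations. A sketch (the value is purely illustrative):

EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- ~176GB total; the buffer pool still stops at 128GB on Standard Edition,
-- leaving the rest for other memory consumers
EXEC sys.sp_configure N'max server memory (MB)', 180224;
RECONFIGURE;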

You might also notice some very specific wording in that left-most column (e.g. "per instance" and "per database") for the features that are being exposed in Standard Edition for the first time. To be more specific:

  • The instance is limited to 128GB of memory for the buffer pool.
  • The instance can have an additional 32GB allocated to Columnstore objects, over and above the buffer pool limit.
  • Each user database on the instance can have an additional 32GB allocated to memory-optimized tables, over and above the buffer pool limit.

And to be crystal clear: These memory limits for ColumnStore and In-Memory OLTP are NOT subtracted from the buffer pool limit, as long as the server has more than 128GB of memory available. If the server has less than 128GB, you will see these technologies compete with buffer pool memory, and in fact be limited to a % of max server memory. More details are available in this post from Microsoft's Parikshit Savjani.

I don't have hardware handy to test the extent of this, but if you had a machine with 256GB or 512GB of memory, you could theoretically use it all with a single Standard Edition instance, if you could – for example – spread your In-Memory data across databases in <= 32GB chunks, for a total of 128GB + (32GB * (# of databases)). If you wanted to use ColumnStore instead of In-Memory, you could spread your data across multiple instances, giving you (128GB + 32GB) * (# of instances). And you could combine these strategies for ((128GB + 32GB ColumnStore) * (# of instances)) + (32GB In-Memory * (# of databases * # of instances)).

Whether breaking your data up in this way is practical for your application, I'm not sure; I'm only suggesting it's possible. Some of you might already be doing some of these things to get better usage out of Standard Edition on servers with more than 128GB of memory.

With ColumnStore specifically, in addition to being allowed to use 32GB in addition to the buffer pool, keep in mind that the compression you can get here means you can often fit a lot more into that 32GB limit than you could with the same data in traditional row-store. And if you can't use ColumnStore for whatever reason (or it still won't fit into 32GB), you can now implement traditional page or row compression – this might not allow you to fit your entire database into the 128GB buffer pool, but it might enable more of your data to be in memory at any given time.
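Enabling traditional compression is straightforward; as a sketch, with a placeholder table name:

-- Estimate how much space PAGE compression would save
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'YourLargeTable',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = N'PAGE';

-- Apply page compression to the table (heap or clustered index)
ALTER TABLE dbo.YourLargeTable
REBUILD WITH (DATA_COMPRESSION = PAGE);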

Similar things are possible in Express (at a lower scale), where you can have 1.4GB for buffer pool, but an additional ~352MB per instance for ColumnStore, and ~352MB per database for In-Memory OLTP.

But Enterprise Edition still has lots of upside

There are many other differentiators to keep interest in Enterprise Edition, aside from effectively unlimited memory all around – from online rebuilds and merry-go-round scans to full-on Availability Groups and all the virtualization rights you can shake a stick at. Even ColumnStore indexes have well-defined performance enhancements reserved for Enterprise Edition.

So just because there are some techniques that will allow you to get more out of Standard Edition, that does not mean it will magically scale to meet your performance needs. Like my other posts about "doing it on a budget" (e.g. partitioning and readable secondaries), you can certainly spend time and effort kludging together a solution, but it will only get you so far. The point of this post was simply to demonstrate that you can get farther with Standard Edition in 2016 SP1 than you ever could before.

The post Memory Limits in SQL Server 2016 SP1 appeared first on SQLPerformance.com.

]]>
https://sqlperformance.com/2016/12/sql-server-2016/memory-limits-sp1/feed 12
A Big Deal : SQL Server 2016 Service Pack 1 https://sqlperformance.com/2016/11/sql-server-2016/big-deal-sp1 https://sqlperformance.com/2016/11/sql-server-2016/big-deal-sp1#comments Wed, 16 Nov 2016 15:43:13 +0000 https://sqlperformance.com/?p=8424 Microsoft has made a bold change with SQL Server 2016 Service Pack 1 : Many Enterprise features are now available in lower editions. Read on for details.

The post A Big Deal : SQL Server 2016 Service Pack 1 appeared first on SQLPerformance.com.

]]>
Well, I can't claim that you heard it here first, since it was announced today by Scott Guthrie (@ScottGu) at Connect(); // 2016 and also on the SQL Server team blog, but SQL Server 2016 Service Pack 1 is the build of SQL Server that will finally make just about everyone upgrade from all previous versions.

Why? Well, I hope you're sitting down:

Many Enterprise features are now available in Standard, Web, and Express.

This is a BIG DEAL.

I've long been an advocate of making the surface area the same across all editions, and restricting each edition to hardware/resource limits. The latter is already the case with Express Edition (1 CPU, 1 GB memory, 10 GB database size) and Standard Edition (24 cores, 128 GB), but features themselves simply don't exist there. While they did add some functionality around Availability Groups (limited support in Standard Edition, and the ability to forego a domain), many of the features that are currently Enterprise only don't seem to be Enterprise-y at all. Here are some reasons and consequences of putting so many critical features only in Enterprise, at least in my opinion:

  1. Data security seems to have taken a back seat in recent years. It should be in all editions; I was disappointed when Always Encrypted was released and it was restricted to Enterprise. The implicit message was that, if you're a small business, you don't need to be able to encrypt credit card data (they say the same with Transparent Data Encryption, but Always Encrypted is a much more valuable offering).
     
  2. Programmability features that moonlight as performance features – like data compression and partitioning – are arguably on the Enterprise side of things, even though I've always felt data compression's sweet spot is the small business who can't afford to just add memory, buy faster disks, or upgrade to Enterprise.
     
    In current versions, when a vendor wants to take advantage of compression or partitioning, they have to consider that not all of their customers will be using Enterprise Edition. Similarly, customers want to use the same codebase in their QA, production, and development environments, but those may not all have the same edition either. Both vendors and customers also want to be able to use features that are available in both cloud and box, regardless of tier or edition. Currently, they either need to write multiple codebases, or simply surrender to the lowest common denominator.
     
  3. Adoption rates are slow because, in many respects, Standard Edition upgrades have been far from compelling in recent years, meaning many people are content with their 2012 Standard instances. They just don't really gain much from moving to a more recent version and, in many cases, the cost of testing a migration and dealing with regressions actually outweighs any benefits.

Colleagues like Brent Ozar and Steve Jones have had similar thoughts – arguing that the best features remain excluded from Standard and lower editions for revenue reasons, but this isn't the way it should be.

And now it's not.

Feature                          Standard / Web   Express   LocalDB
Always Encrypted                 Yes              Yes       Yes
Change Data Capture              Yes              No        No
Columnstore 1                    Yes              Yes       Yes
Data Compression                 Yes              Yes       Yes
Database Snapshots               Yes              Yes       Yes
Dynamic Data Masking             Yes              Yes       Yes
Fine-Grained Auditing            Yes              Yes       Yes
In-Memory OLTP 2                 Yes              Yes       No
Multiple Filestream Containers   Yes              Yes       Yes
Partitioning                     Yes              Yes       Yes
PolyBase                         Yes              Yes       No
Row-Level Security               Yes              Yes       Yes

1. Limited to 32GB per instance on Standard Edition, 352MB on Express.
2. Limited to 32GB per database on Standard Edition, 352MB on Express.

I don't have any evidence of this, but I believe that due to slow adoption rates of SQL Server 2016 – and Standard Edition specifically – Microsoft has finally listened. Perhaps they've realized that making so many compelling features only available in Enterprise Edition doesn't work. Can't get blood from a stone, they say.

The table here shows the set of features now enabled outside of Enterprise Edition (some features were already there in Standard, but I didn't have a good way to illustrate that). It only has to be a table because of three features that don't work on Express and/or LocalDB (due to SQL Server Agent, permissions, or external dependencies).

That's my quick cheat sheet; check the documentation for the latest updates. I've tested most of the features and they work as you would expect – not to trivialize the effort, but they just removed the checks that blocked functionality based on edition. I thought I would trip them up by testing beyond the basics, like incremental statistics on partitioned tables and combinations like Columnstore over a compressed table, to see if they missed something. But no – everything in the table worked just as if I were running Enterprise Edition.
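One of those combination tests can be sketched like this (the table and index names are invented for illustration):

-- A page-compressed rowstore table...
CREATE TABLE dbo.EditionTest
(
    ID     integer       NOT NULL,
    Amount decimal(18,2) NOT NULL
)
WITH (DATA_COMPRESSION = PAGE);

-- ...plus a columnstore index over it: both now work in Standard,
-- Web, and Express editions on SQL Server 2016 SP1
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_EditionTest
ON dbo.EditionTest (ID, Amount);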

That said, the focus here was for a consistent programming surface area (CPSA). So no, you won't suddenly be getting all of the benefits of Enterprise Edition for free or at a substantial discount – see below for functionality that won't work. And some of the features above may still be scale-limited based on edition.

Nonetheless, this does provide a great solution to #2 above: Having many of these features work in all editions, even if they support different levels of scale, will make it easier for both customers and vendors to build solutions that work no matter where they're deployed. This is going to be a huge win, even beyond the pure benefit of finally being able to use things like Always Encrypted and partitioning everywhere.

What Still Differentiates Editions?

Essentially, if it was an Enterprise-level operational or availability feature before, or within a certain set of performance features, it will remain as such. The following is not an exhaustive list, but probably captures most of the big differences:

  • Availability features like online operations, piecemeal restore, and fully functional Availability Groups (e.g. read-only replicas) are still Enterprise only. I wish the REBUILD + (ONLINE = ON) syntax could work as a no-op in lower editions with a warning instead of returning an error, but I can't be too fussy here.
     
  • Performance features like parallelism still won't work in Express Edition (or LocalDB). Automatic indexed view usage without NOEXPAND hints, and high-end features like hot-add memory/CPU, will continue to be available only in Enterprise.
     
  • Operational features like Resource Governor, Extensible Key Management (EKM), and Transparent Data Encryption will remain Enterprise Edition only. Others, like Backup Encryption, Backup Compression, and Buffer Pool Extension, will continue to work in Standard, but will still not function in Express.
     
  • SQL Server Agent is still unavailable in Express and LocalDB. As a result, and as noted above, Change Data Capture will not work. Cross-server Service Broker also remains unavailable in these editions.
     
  • In-Memory OLTP and PolyBase are supported in Express, but unavailable in LocalDB.
     
  • Virtualization Rights haven't changed and are still much more valuable in Enterprise Edition with Software Assurance.
     
  • Resource limits on downlevel editions remain the same. Seems the bean-counters still control the upper memory limit in Standard Edition, so this is still 128 GB (while Enterprise Edition is now 24 TB). Personally, I think this upper limit should make incremental gains with each new version (Moore's Law), because I feel that Standard Edition is expensive enough that its memory limits should never be so dangerously close to the upper bound of a well-equipped laptop. But I'll take what I can get, and when you are stuck on Standard Edition and scale is required, you can now use the above Enterprise features across multiple Standard Edition boxes or instances, instead of trying to scale up.

Summary

This is a bold move by Microsoft, and I can certainly see how this will increase Standard sales without cannibalizing Enterprise sales. Standard Edition customers can use these features both to consolidate their codebases and, in many scenarios, build solutions that offer better performance. And for those who already enjoy all the benefits of Enterprise Edition, this may not be all that interesting to you, but there are also a bunch of new features available across all editions of SP1. So regardless of what edition you're on now, you really have no excuse for holding on to that ancient version of SQL Server.

Go upgrade to SQL Server 2016 SP1!

The post A Big Deal : SQL Server 2016 Service Pack 1 appeared first on SQLPerformance.com.

Stretch Database in SQL Server 2016 RTM https://sqlperformance.com/2016/06/sql-server-2016/stretch-database-rtm https://sqlperformance.com/2016/06/sql-server-2016/stretch-database-rtm#comments Thu, 23 Jun 2016 09:00:36 +0000 http://sqlperformance.com/?p=8234 Tim Radney (@tradney) of SQLskills shows us how the Stretch Database feature has evolved from its early CTP beginnings to the RTM version released earlier this month.

The post Stretch Database in SQL Server 2016 RTM appeared first on SQLPerformance.com.

Back in August 2015, I wrote an article introducing the new Stretch Database feature in SQL Server 2016. In that article, I discussed how to get started with Stretch Database in SQL Server 2016 Community Technology Preview 2 (CTP2). SQL Server 2016 was released on June 1st, 2016, and there have been numerous updates to the product since. The method of setting up Stretch Database has changed slightly, as have some of the features.

Beginning with SQL Server 2016, we gained the ability to store portions of a database in an Azure SQL Database. In earlier previews, when you enabled Stretch for a table, you had to migrate the entire table; with the RTM release of SQL Server 2016, you can now choose a portion of a table. Once you enable Stretch for a table, it will silently migrate your data. If you are not familiar with Stretch Database, it leverages the processing power in Azure to run queries against the remote data by rewriting the query. You do not have to rewrite any queries on your end; you will see this as a “remote query” operator in the query plan.

An easy way to identify databases and tables that are eligible to be Stretch-enabled is to download and run the SQL Server 2016 Upgrade Advisor and its Stretch Database Advisor. Aaron Bertrand (@AaronBertrand) wrote about this a while back. The Upgrade Advisor has changed slightly since Aaron’s post, but the process is mostly the same.

Limitations for Stretch Database

Not all tables will be eligible for being Stretch-enabled. Certain table properties, data and column types, constraints, and indexes are not supported, such as:

  • Memory-optimized and replicated tables
  • Tables that contain FILESTREAM data, use Change Tracking or Change Data Capture
  • Data types such as timestamp, sql_variant, XML, or geography
  • Check or default constraints
  • Foreign key constraints that reference the table
  • XML, full-text, spatial, or clustered columnstore indexes
  • Indexed views that reference the table
  • You cannot run UPDATE or DELETE statements, or run CREATE INDEX or ALTER INDEX operations on a Stretch-enabled table

For a full listing of limitations, you can visit: Requirements and limitations for Stretch Database.
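As a quick, informal sanity check before running the Upgrade Advisor, you could sketch a query against the catalog views to flag columns whose data types block Stretch. This covers only the data-type bullet above, not the full rule set the Stretch Database Advisor applies:

```sql
-- Sketch: flag columns whose data types prevent a table from being
-- Stretch-enabled. Only covers the data-type limitation; the real
-- advisor checks many more conditions.
SELECT s.name  AS schema_name,
       t.name  AS table_name,
       c.name  AS column_name,
       ty.name AS type_name
FROM sys.tables  AS t
JOIN sys.schemas AS s  ON s.schema_id     = t.schema_id
JOIN sys.columns AS c  ON c.object_id     = t.object_id
JOIN sys.types   AS ty ON ty.user_type_id = c.user_type_id
WHERE ty.name IN (N'timestamp', N'sql_variant', N'xml', N'geography');
```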

Setting up Stretch Database

Getting started with the RTM release is a bit different than the earlier previews. You’ll need an Azure account, and then you must enable Stretch Database on the local instance.

To enable Stretch Database on an instance, run:

EXEC sys.sp_configure N'remote data archive', '1';
RECONFIGURE;
GO

For this demo I’m going to use a sample database that I have created called STRETCH. Using SQL Server 2016 Management Studio, I started by right-clicking on the database, choosing Tasks, then Stretch, and then Enable.

Stretch Wizard : Introduction

The next screen lets you choose which tables you'd like to enable for Stretch:

Stretch Wizard : Select Tables

I chose the SALES2 table. The wizard defaults to “Entire Table,” but you can also change that option to migrate a subset of rows.

Stretch Wizard : Select Rows

If you choose by rows, you have to provide a name for your filter criteria, and then you can choose which column to use in the WHERE clause, as well as the condition and value. In this screenshot I chose rows prior to 2016. Being able to choose a portion of a table is a huge improvement over the earlier previews, which only allowed you to stretch the entire table. For simplicity, in this demo, I am going to migrate the entire table, so I clicked Cancel, and then Next.
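The wizard's row filter can also be set up in T-SQL. The following is a sketch under assumptions: Stretch requires the filter to be an inline, schema-bound table-valued function that returns a row only when the input row is eligible for migration, and `SalesDate` here is a hypothetical column name standing in for whatever date column the SALES2 table actually has:

```sql
-- Hypothetical filter function; SalesDate is an assumed column name.
-- The function must be inline, schema-bound, and deterministic.
CREATE FUNCTION dbo.fn_stretch_before_2016 (@SalesDate datetime2)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS is_eligible
       WHERE @SalesDate < CONVERT(datetime2, '20160101');
GO

-- Enable Stretch for the table using the filter predicate:
ALTER TABLE dbo.SALES2
SET (REMOTE_DATA_ARCHIVE = ON (
      FILTER_PREDICATE = dbo.fn_stretch_before_2016(SalesDate),
      MIGRATION_STATE  = OUTBOUND));
```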

Once you have your tables and conditions selected, you have to choose which Azure subscription you are going to use, your Azure region, and your server information.

Stretch Wizard : Configure Azure

Once you have entered the required information, click Next.

Stretch Wizard : Credentials

A new enhancement is the use of a database master key to protect the Azure credentials used to connect to Azure. If you do not have a master key already, you will be prompted to create one; if you already have one, you will need to provide its password. Click Next.

Stretch Wizard : Select IP Address

You will need to create a firewall rule for your server, or you can enter a subnet IP range. Make your selection and click Next.

Stretch Wizard : Summary

This is where things have really changed, and will make me reconsider using this feature. Microsoft has created a Database Stretch Unit (DSU) so that you can scale up or down the level of performance that you need for Stretch data. As of June 2016, current pricing is billed for both compute and storage, which you see represented in the image above. For my 15MB table that was migrated, I would be charged $61 USD per month for storage, as well as the minimum DSU level (100) at $912.50 per month. DSU levels range from:

DSU level   Hourly cost   Max monthly cost (months with 31 days)
      100         $1.25      $930
      200         $2.50    $1,860
      300         $3.75    $2,790
      400         $5.00    $3,720
      500         $6.25    $4,650
      600         $7.50    $5,580
    1,000        $12.50    $9,300
    1,200        $15.00   $11,160
    1,500        $18.75   $13,950
    2,000        $25.00   $18,600

Why did the wizard tell me only $912.50, when the price sheet indicates it should be $900 for June (or pro-rated based on how many days are left in June)? Your guess is as good as mine; I've tried various math stuff and came up blank. You can learn more about the pricing models here:
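One hedged guess: $912.50 is exactly what you get if the wizard quotes an "average" month of 730 hours (8,760 hours in a year divided by 12) at the 100-DSU hourly rate, rather than June's actual 720 hours (which gives the $900 figure above). The arithmetic:

```sql
-- Hypothesis only: the wizard may bill against an average 730-hour month.
SELECT 8760.0 / 12        AS avg_hours_per_month,  -- 730.0
       8760.0 / 12 * 1.25 AS dsu100_avg_month,     -- 912.50
       30 * 24 * 1.25     AS dsu100_june_actual;   -- 900.00
```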

Prior to learning about this new billing method for DSU, I could make the argument that using Stretch Database would be a very cost effective method for storing cold data (unused data) into the cloud. By stretching this data into Azure, you could migrate a large portion of older data, which would decrease the size (and thus cost) of your local backups. In the event you had to restore a database, you would simply have to establish the connection to Azure for the stretched data, thus eliminating the need to restore it. However, with the minimal cost being nearly $1,000 per month for the low end DSU scale, many organizations will find that it is much cheaper to retain the data on a less expensive tier of storage within their data center and find other methods for HA such as mirroring, log shipping, or Availability Groups.

Click Finish to begin the migration.

Stretch Wizard : Results

Congratulations: I have now migrated the SALES2 table to an Azure SQL Database.

Disable a Stretch table

In the early previews of Stretch Database, if you wanted to disable a Stretch table, you had to create a new table and insert the stretched data into it. Once all data was copied, you then had to either manually switch out the tables by renaming them, or manually merge the stretched data back into the production table. With the RTM release, you can still handle the migration manually, choose to leave the data in Azure, or choose an option to bring the data back from Azure.


Regardless of which method you use to bring the data back, you incur data transfer charges.
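For reference, these wizard options map to ALTER TABLE syntax along these lines (a sketch using the demo table name; verify against the documentation for your build):

```sql
-- Bring the migrated data back from Azure (incurs data transfer charges);
-- once all data is back, Stretch can be turned off for the table:
ALTER TABLE dbo.SALES2
  SET (REMOTE_DATA_ARCHIVE (MIGRATION_STATE = INBOUND));

-- Or disable Stretch and abandon the remote copy, leaving the data in Azure:
ALTER TABLE dbo.SALES2
  SET (REMOTE_DATA_ARCHIVE = OFF_WITHOUT_DATA_RECOVERY (MIGRATION_STATE = PAUSED));
```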

Backup and Restore of a Stretch Database

Once you migrate data into a Stretch Database, Azure handles the backup of the Stretch data. Backups occur with a snapshot taken every 8 hours and the snapshots are maintained for 7 days. This gives you up to 21 points-in-time over the previous 7 days to restore.

You don’t have to make any changes to your current local backup routines. Any local backups taken will contain all local data and eligible data that has not yet been migrated. This is referred to as a shallow backup and doesn’t contain any data already migrated to Azure.

DBCC CHECKDB

You also cannot run CHECKDB against data that has been migrated to Azure.

When I ran DBCC CHECKDB on my STRETCH database prior to migration I got the following results for the SALES2 table:

DBCC results for 'SALES2'.
There are 45860 rows in 1901 pages for object "SALES2".

After the migration, I received the following output for the SALES2 table (emphasis mine):

DBCC results for 'SALES2'.
There are 0 rows in 1901 pages for object "SALES2".

You can run DBCC CHECKDB against Azure SQL Database; however, because you cannot connect directly to the stretched Azure SQL Database, you currently cannot run DBCC CHECKDB manually against the stretched data specifically. I cannot find any documentation stating that Azure performs any consistency checks against these databases.

This brings up a significant risk in my opinion.

Summary

Stretch Database is an easy way to migrate archive data to Microsoft Azure, if your database supports it. Currently in SQL Server 2016 RTM there are many limitations on table properties, data and column types, constraints, and indexes. If you are not restricted by those limitations, then Stretch Database is a simple way to migrate historical data to Azure SQL Database to free up local storage and decrease restore times of those databases, if the expense makes it worthwhile. You also need to be comfortable, at least for now, with not being able to run DBCC CHECKDB against any migrated data. Managing restores will be a bit trickier too, since you have to restore the connection between the SQL Server database and the remote Azure database.


SQL Server 2016 Temporal Table Query Plan Behaviour https://sqlperformance.com/2016/06/sql-server-2016/temporal-table-query-plan-behaviour Tue, 14 Jun 2016 00:20:09 +0000 http://sqlperformance.com/?p=8194 For this month's T-SQL Tuesday, Rob Farley takes a look at a couple of unexpected aspects of query plans you might observe when using Temporal Tables in SQL Server 2016.

The post SQL Server 2016 Temporal Table Query Plan Behaviour appeared first on SQLPerformance.com.

T-SQL Tuesday #79

Isn’t it great to have a new version of SQL Server available? This is something that only happens every couple of years, and this month we saw one reach General Availability. (Ok, I know we get a new version of SQL Database in Azure almost continuously, but I count this as different.) Acknowledging this new release, this month’s T-SQL Tuesday (hosted by Michael Swart, @mjswart) is on the topic of all things SQL Server 2016!

So today I want to look at SQL 2016’s Temporal Tables feature, and have a look at some query plan situations you could end up seeing. I love Temporal Tables, but have come across a bit of a gotcha that you might want to be aware of.

Now, despite the fact that SQL Server 2016 is now in RTM, I’m using AdventureWorks2016CTP3, which you can download here – but don’t just download AdventureWorks2016CTP3.bak, also grab SQLServer2016CTP3Samples.zip from the same site.

You see, in the Samples archive, there are some useful scripts for trying out new features, including some for Temporal Tables. It’s win-win – you get to try a bunch of new features, and I don’t have to repeat so much script in this post. Anyway, go and grab the two scripts about Temporal Tables, running AW 2016 CTP3 Temporal Setup.sql, followed by Temporal System-Versioning Sample.sql.

These scripts set up temporal versions of a few tables, including HumanResources.Employee. The setup script creates HumanResources.Employee_Temporal (although, technically, it could’ve been called anything). At the end of the CREATE TABLE statement, this bit appears, adding two hidden columns that indicate when each row is valid, and specifying that a table called HumanResources.Employee_Temporal_History should be created to store the old versions.

  ...
  ValidFrom datetime2(7) GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
  ValidTo datetime2(7)   GENERATED ALWAYS AS ROW END   HIDDEN NOT NULL,
  PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
) WITH (SYSTEM_VERSIONING = ON 
  (HISTORY_TABLE = [HumanResources].[Employee_Temporal_History])
);

What I want to explore in this post is what happens with query plans when the history is used.

If I query the table to see the latest row for a particular BusinessEntityID, I get a Clustered Index Seek, as expected.

SELECT e.BusinessEntityID, e.ValidFrom, e.ValidTo
FROM HumanResources.Employee_Temporal AS e
WHERE e.BusinessEntityID = 4;

Expected clustered index seek

I’m sure I could query this table using other indexes, if it had any. But in this case, it doesn’t. Let’s create one.

CREATE UNIQUE INDEX rf_ix_Login 
on HumanResources.Employee_Temporal(LoginID);

Now I can query the table by LoginID, and will see a Key Lookup if I ask for columns other than LoginID or BusinessEntityID. None of this is surprising.

SELECT * FROM HumanResources.Employee_Temporal e
WHERE e.LoginID = N'adventure-works\rob0';

Key Lookup to cover additional columns

Let’s use SQL Server Management Studio for a minute, and have a look at how this table looks in Object Explorer.

Temporal Table representation in SSMS Object Explorer

We can see the History table mentioned under HumanResources.Employee_Temporal, and the columns and indexes from both the table itself and the history table. But while the indexes on the proper table are the Primary Key (on BusinessEntityID) and the index I had just created, the History table doesn’t have matching indexes.

The index on the history table is on ValidTo and ValidFrom. We can right-click the index and select Properties, and we see this dialog:

History table clustered index properties

A new row is inserted into this History table when the corresponding row is no longer valid in the main table, because it has just been deleted or changed. The ValidTo column is naturally populated with the current time, so ValidTo acts as an ascending key, much like an identity column, and new inserts land at the end of the b-tree structure.

But how does this perform when you want to query the table?

If we want to query our table for what was current at a particular point in time, then we should use a query structure such as:

SELECT * FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME AS OF '20160612 11:22';

This query needs to concatenate the appropriate rows from the main table with the appropriate rows from the history table.

Concatenation from history table

In this scenario, the rows that were valid for the moment I picked were all from the history table, but nonetheless, we see a Clustered Index Scan against the main table, which was filtered by a Filter operator. The predicate of this filter is:

[HumanResources].[Employee_Temporal].[ValidFrom] <= '2016-06-12 11:22:00.0000000' 
AND [HumanResources].[Employee_Temporal].[ValidTo] > '2016-06-12 11:22:00.0000000'

Let’s revisit this in a moment.

The Clustered Index Seek on the History table must clearly be leveraging a Seek Predicate on ValidTo. The Start of the Seek’s Range Scan is HumanResources.Employee_Temporal_History.ValidTo > Scalar Operator('2016-06-12 11:22:00'), but there is no End, because every row that has a ValidTo after the time we care about is a candidate row, and must be tested for an appropriate ValidFrom value by the Residual Predicate, which is HumanResources.Employee_Temporal_History.ValidFrom <= '2016-06-12 11:22:00'.

Now, intervals are hard to index for; that’s a known thing that has been discussed on many blogs. Most effective solutions consider creative ways to write queries, but no such smarts have been built into Temporal Tables. You can, though, put indexes on other columns too, such as on ValidFrom, or even have indexes that match the types of queries you might have on the main table. With the clustered index being a composite key on both ValidTo and ValidFrom, these two columns are implicitly included in every other index on the history table, providing a good opportunity for some Residual Predicate testing.

If I know which loginid I’m interested in, my plan forms a different shape.

Different plan with specific LoginID requested

The top branch of the Concatenation operator looks similar to before, although that Filter operator has entered the fray to remove any rows that are not valid, but the Clustered Index Seek on the lower branch has a Warning. This is a Residual Predicate warning, like the examples in an earlier post of mine. It’s able to filter to entries that are valid until some point after the time we care about, but the Residual Predicate now filters to the LoginID as well as ValidFrom.

[HumanResources].[Employee_Temporal_History].[ValidFrom] <= '2016-06-12 11:22:00.0000000' 
AND [HumanResources].[Employee_Temporal_History].[LoginID] = N'adventure-works\rob0'

Changes to rob0’s rows are going to be a tiny proportion of the rows in the History. This column won’t be unique like in the main table, because the row may have been changed multiple times, but there is still a good candidate for indexing.

CREATE INDEX rf_ixHist_loginid
ON HumanResources.Employee_Temporal_History(LoginID);

This new index has a notable effect on our plan.

It’s now changed our Clustered Index Seek into a Clustered Index Scan!!

Seek changed to a scan

You see, the Query Optimizer now works out that the best thing to do would be to use the new index. But it also decides that the effort in having to do lookups to get all the other columns (because I was asking for all columns) would be simply too much work. The tipping point was reached (sadly an incorrect assumption in this case), and a Clustered Index SCAN chosen instead. Even though without the non-clustered index, the best option would have been to use a Clustered Index Seek, when the non-clustered index has been considered and rejected for tipping-point reasons, it chooses to scan.

Frustratingly, I’ve only just created this index and its statistics should be good. It should know that a Seek that requires exactly one lookup should be better than a Clustered Index Scan (only by statistics – if you were thinking it should know this because LoginID is unique in the main table, remember that it may not always have been). So I suspect that lookups should be avoided in history tables, although I haven’t done quite enough research into this yet.

Were we to query only columns that appear in our non-clustered index, we would get much better behaviour. With no lookup required, our new index on the history table is happily used. It still needs to apply a Residual Predicate, since it can only seek on LoginID and ValidTo, but it behaves much better than dropping into a Clustered Index Scan.

SELECT LoginID, ValidFrom, ValidTo
FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME AS OF '20160612 11:22'
WHERE LoginID = N'adventure-works\rob0'

Using covered columns eliminates scans, lookup, filter, and residual warnings

So do index your history tables in extra ways, considering how you will be querying them. Include the necessary columns to avoid lookups, because you’re really avoiding Scans.
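For example, a covering version of the earlier history-table index might look like this (a sketch: the INCLUDE list is illustrative, and should match the columns your own queries actually select):

```sql
-- Sketch: key the history index for the query pattern above, and include
-- the selected columns so the temporal query needs no lookups.
CREATE INDEX rf_ixHist_loginid_covering
ON HumanResources.Employee_Temporal_History (LoginID, ValidTo, ValidFrom)
INCLUDE (BusinessEntityID, JobTitle, HireDate);  -- illustrative column list
```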

These history tables can grow large if data is changing frequently. So be mindful of how they are being handled. This same situation occurs when using the other FOR SYSTEM_TIME constructs, so you should (as always) review the plans your queries are producing, and index to make sure that you are well positioned to leverage what is a very powerful feature of SQL Server 2016.
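For reference, the other FOR SYSTEM_TIME sub-clauses look like this, and the same indexing considerations apply to each:

```sql
-- Rows valid at any point in the range (rows opening exactly at the
-- end boundary are excluded):
SELECT * FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME FROM '20160101' TO '20160612';

-- Same, but also includes rows that became valid exactly at the end boundary:
SELECT * FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME BETWEEN '20160101' AND '20160612';

-- Only rows that both opened and closed within the range:
SELECT * FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME CONTAINED IN ('20160101', '20160612');

-- Every version of every row: current plus all history:
SELECT * FROM HumanResources.Employee_Temporal
FOR SYSTEM_TIME ALL;
```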


STRING_SPLIT() in SQL Server 2016 : Follow-Up #2 https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-2 https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-2#comments Fri, 22 Apr 2016 09:00:21 +0000 http://sqlperformance.com/?p=8097 With additional tests comparing splitting techniques to TVPs, Aaron Bertrand (@AaronBertrand) rounds out his series on STRING_SPLIT() in SQL Server 2016.

The post STRING_SPLIT() in SQL Server 2016 : Follow-Up #2 appeared first on SQLPerformance.com.

Earlier this week, I posted a follow-up to my recent post about STRING_SPLIT() in SQL Server 2016, addressing several comments left on the post and/or sent to me directly.

After that post was mostly written, there was a late-breaking question from Doug Ellner:

How do these functions compare with table-valued parameters?

Now, testing TVPs was already on my list of future projects, after a recent twitter exchange with @Nick_Craver over at Stack Overflow. He said they were excited that STRING_SPLIT() performed well, because they were unhappy with the performance of sending ~7,000 values in through a table-valued parameter.

My Tests

For these tests, I used SQL Server 2016 RC3 (13.0.1400.361) on an 8-core Windows 10 VM, with PCIe storage and 32 GB of RAM.

I created a simple table that mimicked what they were doing (selecting about 10,000 values from a 3+ million row posts table), but for my tests, it has far fewer columns and fewer indexes:

CREATE TABLE dbo.Posts_Regular
(
  PostID   int PRIMARY KEY,
  HitCount int NOT NULL DEFAULT 0
);

INSERT dbo.Posts_Regular(PostID) 
  SELECT TOP (3000000) ROW_NUMBER() OVER (ORDER BY s1.[object_id])
  FROM sys.all_objects AS s1 
  CROSS JOIN sys.all_objects AS s2;

I also created an In-Memory version, because I was curious if any approach would work differently there:

CREATE TABLE dbo.Posts_InMemory
(
  PostID   int PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 4000000),
  HitCount int NOT NULL DEFAULT 0
) WITH (MEMORY_OPTIMIZED = ON);

Now, I wanted to create a C# app that would pass in 10,000 unique values, either as a comma-separated string (built using a StringBuilder) or as a TVP (passed from a DataTable). The point would be to retrieve or update a selection of rows based on a match, either to an element produced by splitting the list, or an explicit value in a TVP. So the code was written to append every 300th value to the string or DataTable (C# code is in an appendix below). I took the functions I created in the original post, altered them to handle varchar(max), and then added two functions that accepted a TVP – one of them memory-optimized. Here are the table types (the functions are in the appendix below):

CREATE TYPE dbo.PostIDs_Regular AS TABLE(PostID int PRIMARY KEY);
GO

CREATE TYPE dbo.PostIDs_InMemory AS TABLE
(
  PostID int NOT NULL PRIMARY KEY NONCLUSTERED HASH 
  WITH (BUCKET_COUNT = 1000000)
) 
WITH (MEMORY_OPTIMIZED = ON);
GO

I also had to make the Numbers table bigger in order to handle strings > 8K and with > 8K elements (I made it 1MM rows). Then I created seven stored procedures: five of them taking a varchar(max) and joining with the function output in order to update the base table, and then two to accept the TVP and join directly against that. The C# code calls each of these seven procedures, with the list of 10,000 posts to select or update, 1,000 times. These procedures are also in the appendix below. So just to summarize, the methods being tested are:

  • Native (STRING_SPLIT())
  • XML
  • CLR
  • Numbers table
  • JSON (with explicit int output)
  • Table-valued parameter
  • Memory-optimized table-valued parameter

We'll test retrieving the 10,000 values, 1,000 times, using a DataReader – but not iterating over the DataReader, since that would just make the test take longer, and would be the same amount of work for the C# application regardless of how the database produced the set. We'll also test updating the 10,000 rows, 1,000 times each, using ExecuteNonQuery(). And we'll test against both the regular and memory-optimized versions of the Posts table, which we can switch very easily without having to change any of the functions or procedures, using a synonym:

CREATE SYNONYM dbo.Posts FOR dbo.Posts_Regular;

-- to test memory-optimized version:
DROP SYNONYM dbo.Posts;
CREATE SYNONYM dbo.Posts FOR dbo.Posts_InMemory;

-- to test the disk-based version again:
DROP SYNONYM dbo.Posts;
CREATE SYNONYM dbo.Posts FOR dbo.Posts_Regular;

I kicked off the application, ran it several times for each combination to ensure compilation, caching, and other factors weren't being unfair to the batch executed first, and then analyzed the results from the logging table (I also spot-checked sys.dm_exec_procedure_stats to make sure none of the approaches had significant application-based overhead, and they did not).

Results – Disk-Based Tables

I struggle with data visualization sometimes – I really tried to come up with a way to represent these metrics on a single chart, but I think there were just far too many data points to make the salient ones stand out.

You can click to enlarge any of these in a new tab/window, but even if you have a small window I tried to make the winner clear through use of color (and the winner was the same in every case). And to be clear, by "Average Duration" I mean the average amount of time it took for the application to complete a loop of 1,000 operations.

Average Duration (milliseconds) for SELECTs against disk-based Posts table Average Duration (milliseconds) for SELECTs against disk-based Posts table

Average Duration (milliseconds) for UPDATEs against disk-based Posts table Average Duration (milliseconds) for UPDATEs against disk-based Posts table

The most interesting thing here, for me, is how poorly the memory-optimized TVP did when assisting with an UPDATE. It turns out that parallel scans are currently blocked too aggressively when DML is involved; Microsoft has recognized this as a feature gap, and they are hoping to address it soon. Note that parallel scan is currently possible with SELECT but it is blocked for DML right now. (It will not be resolved in SQL Server 2014, as these specific parallel scan operations are not available there for any operation.) When that is fixed, or when your TVPs are smaller and/or parallelism isn't beneficial anyway, you should see that memory-optimized TVPs will perform better (the pattern just doesn't work well for this particular use case of relatively large TVPs).

For this specific case, here are the plans for the SELECT (which I could coerce to go parallel) and the UPDATE (which I could not):

Parallelism in a SELECT plan joining a disk-based table to an in-memory TVPParallelism in a SELECT plan joining a disk-based table to an in-memory TVP

No parallelism in an UPDATE plan joining a disk-based table to an in-memory TVPNo parallelism in an UPDATE plan joining a disk-based table to an in-memory TVP

Results – Memory-Optimized Tables

A little more consistency here – the four methods on the right are relatively even, while the three on the left seem very undesirable by contrast. Also pay particular attention to absolute scale compared to the disk-based tables – for the most part, using the same methods, and even without parallelism, you end up with much quicker operations against memory-optimized tables, leading to lower overall CPU usage.

Average Duration (milliseconds) for SELECTs against memory-optimized Posts table Average Duration (milliseconds) for SELECTs against memory-optimized Posts table

Average Duration (milliseconds) for UPDATEs against memory-optimized Posts table Average Duration (milliseconds) for UPDATEs against memory-optimized Posts table


Conclusion

For this specific test, with a specific data size, distribution, and number of parameters, and on my particular hardware, JSON was a consistent winner (though marginally so). For some of the other tests in previous posts, though, other approaches fared better. As just one example of how what you're doing, and where you're doing it, can have a dramatic impact on the relative efficiency of various techniques, here are the things I've tested in this brief series, with my summary of which technique to use in each case, and which to use as a 2nd or 3rd choice (for example, if you can't implement CLR due to corporate policy or because you're using Azure SQL Database, or you can't use JSON or STRING_SPLIT() because you aren't on SQL Server 2016 yet). Note that I didn't go back and re-test the variable assignment and SELECT INTO scripts using TVPs – these tests were set up assuming you already had existing data in CSV format that would have to be broken up first anyway. Generally, if you can avoid it, don't smoosh your sets into comma-separated strings in the first place, IMHO.

Goal                                           1st choice       2nd choice (and 3rd, where appropriate)
Simple variable assignment                     STRING_SPLIT()   CLR if < 2016; XML if no CLR and < 2016
SELECT INTO                                    CLR              XML if no CLR
SELECT INTO (no spool)                         CLR              Numbers table if no CLR
SELECT INTO (no spool + MAXDOP 1)              STRING_SPLIT()   CLR if < 2016; Numbers table if no CLR and < 2016
SELECT joining large list (disk-based)         JSON (int)       TVP if < 2016
SELECT joining large list (memory-optimized)   JSON (int)       TVP if < 2016
UPDATE joining large list (disk-based)         JSON (int)       TVP if < 2016
UPDATE joining large list (memory-optimized)   JSON (int)       TVP if < 2016

For Doug's specific question: JSON, STRING_SPLIT(), and TVPs performed rather similarly across these tests on average – close enough that TVPs are the obvious choice if you're not on SQL Server 2016. If you have different use cases, these results may differ. Greatly.

Which brings us to the moral of this story: I and others may perform very specific performance tests, revolving around any feature or approach, and come to some conclusion about which approach is fastest. But there are so many variables, I will never have the confidence to say "this approach is always the fastest." In this scenario, I tried very hard to control most of the contributing factors, and while JSON won in all four cases, you can see how those different factors affected execution times (and drastically so for some approaches). So it is always worth it to construct your own tests, and I hope I have helped illustrate how I go about that sort of thing.


Appendix A : Console Application Code

Please, no nit-picking about this code; it was literally thrown together as a very simple way to run these stored procedures 1,000 times with true lists and DataTables assembled in C#, and to log the time each loop took to a table (to be sure to include any application-related overhead with handling either a large string or a collection). I could add error handling, loop differently (e.g. construct the lists inside the loop instead of reuse a single unit of work), and so on.

using System;
using System.Text;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;
 
namespace SplitTesting
{
  class Program
  {
    static void Main(string[] args)
    {
      string operation = "Update";
      if (args.Length > 0 && args[0] == "-Select") { operation = "Select"; }
      var csv = new StringBuilder();
      DataTable elements = new DataTable();
      elements.Columns.Add("value", typeof(int));
      for (int i = 1; i <= 10000; i++)
      {
        csv.Append((i*300).ToString());
        if (i < 10000) { csv.Append(","); }
        elements.Rows.Add(i*300);
      }
      string[] methods = { "Native", "CLR", "XML", "Numbers", "JSON", "TVP", "TVP_InMemory" };

      using (SqlConnection con = new SqlConnection())
      {
        con.ConnectionString = ConfigurationManager.ConnectionStrings["primary"].ToString();
        con.Open();
        SqlParameter p;

        foreach (string method in methods)
        {
          SqlCommand cmd = new SqlCommand("dbo." + operation + "Posts_" + method, con);
          cmd.CommandType = CommandType.StoredProcedure;
          if (method == "TVP" || method == "TVP_InMemory")
          {
            cmd.Parameters.Add("@PostList", SqlDbType.Structured).Value = elements;
          }
          else
          {
            cmd.Parameters.Add("@PostList", SqlDbType.VarChar, -1).Value = csv.ToString();
          }

          var timer = System.Diagnostics.Stopwatch.StartNew();
          for (int x = 1; x <= 1000; x++)
          {
            if (operation == "Update") { cmd.ExecuteNonQuery(); }
            else { SqlDataReader rdr = cmd.ExecuteReader(); rdr.Close(); }
          }
          timer.Stop();
          long this_time = timer.ElapsedMilliseconds;

          // log time - the logging procedure adds clock time and 
          // records memory/disk-based (determined via synonym)

          SqlCommand log = new SqlCommand("dbo.LogBatchTime", con);
          log.CommandType = CommandType.StoredProcedure;
          log.Parameters.Add("@Operation", SqlDbType.VarChar, 32).Value = operation;
          log.Parameters.Add("@Method", SqlDbType.VarChar, 32).Value = method;
          log.Parameters.Add("@Timing", SqlDbType.Int).Value = this_time;
          log.ExecuteNonQuery();

          Console.WriteLine(method + " : " + this_time.ToString());
        }
      }
    }
  }
}

Sample usage:

SplitTesting.exe -Select
SplitTesting.exe -Update

 

Appendix B : Functions, Procedures, and Logging Table

Here were the functions edited to support varchar(max) (the CLR function already accepted nvarchar(max) and I was still reluctant to try to change it):

CREATE FUNCTION dbo.SplitStrings_Native( @List varchar(max), @Delimiter char(1))
RETURNS TABLE WITH SCHEMABINDING
AS
    RETURN (SELECT [value] FROM STRING_SPLIT(@List, @Delimiter));
GO

CREATE FUNCTION dbo.SplitStrings_XML( @List varchar(max), @Delimiter char(1))
RETURNS TABLE WITH SCHEMABINDING
AS
   RETURN (SELECT [value] = y.i.value('(./text())[1]', 'varchar(max)')
      FROM (SELECT x = CONVERT(XML, '<i>' + REPLACE(@List, @Delimiter, '</i><i>') 
        + '</i>').query('.')) AS a CROSS APPLY x.nodes('i') AS y(i));
GO

CREATE FUNCTION dbo.SplitStrings_Numbers( @List varchar(max), @Delimiter char(1))
RETURNS TABLE WITH SCHEMABINDING
AS
  RETURN (SELECT [value] = SUBSTRING(@List, Number, 
       CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
     FROM dbo.Numbers WHERE Number <= CONVERT(INT, LEN(@List))
     AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
  );
GO

CREATE FUNCTION dbo.SplitStrings_JSON( @List varchar(max), @Delimiter char(1))
RETURNS TABLE WITH SCHEMABINDING
AS
    RETURN (SELECT [value] FROM OPENJSON(CHAR(91) + @List + CHAR(93)) WITH (value int '$'));
GO
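The CHAR(91) and CHAR(93) calls are simply the [ and ] characters; the function turns the delimited list into a JSON array literal and lets OPENJSON() do the shredding. A quick Python sketch (illustrative only, not part of the original tests) of the same idea:

```python
# Illustrative sketch: the JSON split trick wraps the comma-separated list
# in [ and ] so the whole string becomes a JSON array, then parses it.
import json

def split_as_json(csv_list):
    # Equivalent in spirit to OPENJSON('[' + @List + ']') WITH (value int '$')
    return json.loads("[" + csv_list + "]")

print(split_as_json("300,600,900"))  # [300, 600, 900]
```

Note that, just like the T-SQL version, this only works if the list contains nothing that breaks JSON parsing, which is why STRING_ESCAPE() may be needed for string payloads.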

And the stored procedures looked like this:

CREATE PROCEDURE dbo.UpdatePosts_Native
  @PostList varchar(max)
AS
BEGIN
  UPDATE p SET HitCount += 1
  FROM dbo.Posts AS p
  INNER JOIN dbo.SplitStrings_Native(@PostList, ',') AS s
  ON p.PostID = s.[value];
END
GO
CREATE PROCEDURE dbo.SelectPosts_Native
  @PostList varchar(max)
AS
BEGIN
  SELECT p.PostID, p.HitCount
  FROM dbo.Posts AS p
  INNER JOIN dbo.SplitStrings_Native(@PostList, ',') AS s
  ON p.PostID = s.[value];
END
GO
-- repeat for the 4 other varchar(max)-based methods

CREATE PROCEDURE dbo.UpdatePosts_TVP
  @PostList dbo.PostIDs_Regular READONLY -- switch _Regular to _InMemory
AS
BEGIN
  SET NOCOUNT ON;

  UPDATE p SET HitCount += 1
  FROM dbo.Posts AS p
  INNER JOIN @PostList AS s
  ON p.PostID = s.PostID;
END
GO
CREATE PROCEDURE dbo.SelectPosts_TVP
  @PostList dbo.PostIDs_Regular READONLY -- switch _Regular to _InMemory 
AS
BEGIN
  SET NOCOUNT ON;

  SELECT p.PostID, p.HitCount
  FROM dbo.Posts AS p
  INNER JOIN @PostList AS s
  ON p.PostID = s.PostID;
END
GO
-- repeat for in-memory

And finally, the logging table and procedure:

CREATE TABLE dbo.SplitLog
(
  LogID           int IDENTITY(1,1) PRIMARY KEY,
  ClockTime       datetime          NOT NULL DEFAULT GETDATE(),
  OperatingTable  nvarchar(513)     NOT NULL, -- Posts_InMemory or Posts_Regular
  Operation       varchar(32)       NOT NULL DEFAULT 'Update', -- or select
  Method          varchar(32)       NOT NULL DEFAULT 'Native', -- or TVP, JSON, etc.
  Timing          int NOT NULL DEFAULT 0
);
GO

CREATE PROCEDURE dbo.LogBatchTime
  @Operation  varchar(32),
  @Method     varchar(32),
  @Timing     int
AS
BEGIN
  SET NOCOUNT ON;

  INSERT dbo.SplitLog(OperatingTable, Operation, Method, Timing) 
    SELECT base_object_name, @Operation, @Method, @Timing
    FROM sys.synonyms WHERE name = N'Posts';
END
GO

-- and the query to generate the graphs:

;WITH x AS
(
  SELECT OperatingTable,Operation,Method,Timing,
    Recency = ROW_NUMBER() OVER 
      (PARTITION BY OperatingTable,Operation,Method 
       ORDER BY ClockTime DESC)
  FROM dbo.SplitLog
)
SELECT OperatingTable,Operation,Method,AverageDuration = AVG(1.0*Timing)
FROM x WHERE Recency <= 3
GROUP BY OperatingTable,Operation,Method;
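The query above keeps only the three most recent timings per (table, operation, method) combination before averaging. For clarity, here is a hedged Python sketch of the same logic (the sample rows are made up for illustration):

```python
# Average the `keep` most recent timings per (table, operation, method),
# mirroring the ROW_NUMBER() OVER (... ORDER BY ClockTime DESC) / AVG query.
from collections import defaultdict

def average_recent(rows, keep=3):
    # rows: iterable of (clock_time, table, operation, method, timing_ms)
    groups = defaultdict(list)
    for clock_time, table, op, method, timing in rows:
        groups[(table, op, method)].append((clock_time, timing))
    result = {}
    for key, entries in groups.items():
        entries.sort(key=lambda e: e[0], reverse=True)  # most recent first
        recent = [t for _, t in entries[:keep]]
        result[key] = sum(recent) / len(recent)
    return result

rows = [
    (1, "Posts_Regular", "Select", "JSON", 100),
    (2, "Posts_Regular", "Select", "JSON", 110),
    (3, "Posts_Regular", "Select", "JSON", 120),
    (4, "Posts_Regular", "Select", "JSON", 130),  # only the last 3 count
]
print(average_recent(rows))  # {('Posts_Regular', 'Select', 'JSON'): 120.0}
```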

The post STRING_SPLIT() in SQL Server 2016 : Follow-Up #2 appeared first on SQLPerformance.com.

]]>
https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-2/feed 15
STRING_SPLIT() in SQL Server 2016 : Follow-Up #1 https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-1 https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-1#comments Tue, 19 Apr 2016 09:00:24 +0000 http://sqlperformance.com/?p=8071 Aaron Bertrand (@AaronBertrand) follows up on a recent post about the performance of STRING_SPLIT() with a few additional reader-motivated tests.

The post STRING_SPLIT() in SQL Server 2016 : Follow-Up #1 appeared first on SQLPerformance.com.

]]>
A few weeks ago, I wrote about how surprised I was at the performance of a new native function in SQL Server 2016, STRING_SPLIT():

After the post was published, I got a few comments (publicly and privately) with these suggestions (or questions that I turned into suggestions):

  • Specifying an explicit output data type for the JSON approach, so that the method doesn't suffer any potential performance overhead from falling back to nvarchar(max).
  • Testing a slightly different approach, where something is actually done with the data – namely SELECT INTO #temp.
  • Showing how estimated row counts compare to existing methods, particularly when nesting split operations.

I did respond to some people offline, but thought it would be worth posting a follow-up here.

Being fairer to JSON

The original JSON function looked like this, with no specification for output data type:

CREATE FUNCTION dbo.SplitStrings_JSON
...
    RETURN (SELECT value FROM OPENJSON( CHAR(91) + @List + CHAR(93) ));

I renamed it, and created two more, with the following definitions:

CREATE FUNCTION dbo.SplitStrings_JSON_int
...
    RETURN (SELECT value FROM OPENJSON( CHAR(91) + @List + CHAR(93) ) 
      WITH ([value] int '$'));
GO

CREATE FUNCTION dbo.SplitStrings_JSON_varchar
...
    RETURN (SELECT value FROM OPENJSON( CHAR(91) + @List + CHAR(93) ) 
      WITH ([value] varchar(100) '$'));

I thought this would drastically improve performance, but alas, this was not the case. I ran the tests again and the results were as follows:

Testing additional JSON approaches

The waits observed during a random instance of the test (filtered to those > 25):

Method         | Wait type                      | Wait time (ms)
---------------|--------------------------------|---------------
CLR            | IO_COMPLETION                  | 1,595
CLR            | SOS_SCHEDULER_YIELD            | 76
CLR            | RESERVED_MEMORY_ALLOCATION_EXT | 76
CLR            | MEMORY_ALLOCATION_EXT          | 28
JSON_int       | MEMORY_ALLOCATION_EXT          | 6,294
JSON_int       | SOS_SCHEDULER_YIELD            | 95
JSON_original  | MEMORY_ALLOCATION_EXT          | 4,307
JSON_original  | SOS_SCHEDULER_YIELD            | 83
JSON_varchar   | MEMORY_ALLOCATION_EXT          | 6,110
JSON_varchar   | SOS_SCHEDULER_YIELD            | 87
Numbers        | SOS_SCHEDULER_YIELD            | 96
XML            | MEMORY_ALLOCATION_EXT          | 1,917
XML            | IO_COMPLETION                  | 1,616
XML            | SOS_SCHEDULER_YIELD            | 147
XML            | RESERVED_MEMORY_ALLOCATION_EXT | 73

Waits observed > 25 (note there is no entry for STRING_SPLIT)

While changing from the default to varchar(100) did improve performance a little, the gain was negligible, and changing to int actually made it worse. Add to this that you probably need to add STRING_ESCAPE() to the incoming string in some scenarios, just in case they have characters that will mess up JSON parsing. My conclusion is still that this is a neat way to use the new JSON functionality, but mostly a novelty inappropriate for reasonable scale.

Materializing the Output

Jonathan Magnan made this astute observation on my previous post:

STRING_SPLIT is indeed very fast, however also slow as hell when working with temporary table (unless it get fixed in a future build).
 

SELECT f.value
INTO #test
FROM dbo.SourceTable AS s
CROSS APPLY string_split(s.StringValue, ',') AS f

Will be WAY slower than SQL CLR solution (15x and more!).

So, I dug in. I created code that would call each of my functions and dump the results into a #temp table, and time them:

SET NOCOUNT ON;

SELECT N'SET NOCOUNT ON;
TRUNCATE TABLE dbo.Timings;
GO
';

SELECT N'DECLARE @d DATETIME = SYSDATETIME();
 
INSERT dbo.Timings(dt, test, point, wait_type, wait_time_ms)
SELECT @d, test = ''' + name + ''', point  = ''Start'', wait_type, wait_time_ms
FROM sys.dm_exec_session_wait_stats WHERE session_id = @@SPID;
GO
 
SELECT f.value
INTO #test
FROM dbo.SourceTable AS s
CROSS APPLY dbo.'+name+'(s.StringValue, '','') AS f;
GO

DECLARE @d DATETIME = SYSDATETIME();
 
INSERT dbo.Timings(dt, test, point, wait_type, wait_time_ms)
SELECT @d, '''+name+''', ''End'', wait_type, wait_time_ms
FROM sys.dm_exec_session_wait_stats WHERE session_id = @@SPID;

DROP TABLE #test;
GO'
FROM sys.objects WHERE name LIKE '%split%';

I just ran each test once (rather than loop 100 times), because I didn't want to completely thrash the I/O on my system. Still, after averaging three test runs, Jonathan was absolutely, 100% right. Here were the durations of populating a #temp table with ~500,000 rows using each method:

Populating a #temp table with output of split functions

So here, the JSON and STRING_SPLIT methods took about 10 seconds each, while the Numbers table, CLR, and XML approaches took less than a second. Perplexed, I investigated the waits, and sure enough, the four methods on the left incurred significant LATCH_EX waits (about 25 seconds) not seen in the other three, and there were no other significant waits to speak of.

And since the latch waits were greater than total duration, it gave me a clue that this had to do with parallelism (this particular machine has 4 cores). So I generated test code again, changing just one line to see what would happen without parallelism:

CROSS APPLY dbo.'+name+'(s.StringValue, '','') AS f OPTION (MAXDOP 1);

Now STRING_SPLIT fared a lot better (as did the JSON methods), but still at least double the time taken by CLR:

So, there might be a remaining issue in these new methods when parallelism is involved. It wasn't a thread distribution issue (I checked that), and CLR actually had worse estimates (100x actual vs. just 5x for STRING_SPLIT); just some underlying issue with coordinating latches amongst threads I suppose. For now, it might be worthwhile using MAXDOP 1 if you know you are writing the output onto new pages.

I've included the graphical plans comparing the CLR approach to the native one, for both parallel and serial execution (I've also uploaded a Query Analysis file that you can open up in SQL Sentry Plan Explorer to snoop around on your own):

STRING_SPLIT

STRING_SPLIT parallel

STRING_SPLIT serial

CLR

CLR parallel

CLR serial

The sort warning, FYI, was nothing too shocking, and obviously didn't have much tangible effect on the query duration:

Spill warning

Spools Out For Summer

When I looked a little closer at those plans, I noticed that in the CLR plan, there is a lazy spool. This is introduced to make sure that duplicates are processed together (to save work by doing less actual splitting), but this spool is not always possible in all plan shapes, and it can give a bit of an advantage to those that can use it (e.g. the CLR plan), depending on estimates. To compare without spools, I enabled trace flag 8690, and ran the tests again. First, here is the parallel CLR plan without the spool:

Parallel plan for CLR function with TF 8690 enabled

And here were the new durations for all queries going parallel with TF 8690 enabled:

Timing results for parallel queries with TF 8690 enabled

Now, here is the serial CLR plan without the spool:

Serial plan for CLR function with TF 8690 enabled

And here were the timing results for queries using both TF 8690 and MAXDOP 1:

Timing results for serial queries with TF 8690 enabled

(Note that, other than the XML plan, most of the others didn't change at all, with or without the trace flag.)

Comparing estimated rowcounts

Dan Holmes asked the following question:

How does it estimate the data size when joined to another (or multiple) split function? The link below is a write up of a CLR Based split implementation. Does the 2016 do a 'better' job with data estimates? (unfortunately i don't have hte ability to install the RC yet).
 
http://sql.dnhlms.com/2016/02/sql-clr-based-string-splitting-and.html

So, I swiped the code from Dan's post, changed it to use my functions, and ran it through Plan Explorer:

DECLARE @s VARCHAR(MAX);

SELECT * FROM dbo.SplitStrings_CLR(@s, ',') s
CROSS APPLY   dbo.SplitStrings_CLR(s.value, ';') s1 
CROSS APPLY   dbo.SplitStrings_CLR(s1.value, '!') s2
CROSS APPLY   dbo.SplitStrings_CLR(s2.value, '#') s3;

SELECT * FROM dbo.SplitStrings_Numbers(@s, ',') s
CROSS APPLY   dbo.SplitStrings_Numbers(s.value, ';') s1 
CROSS APPLY   dbo.SplitStrings_Numbers(s1.value, '!') s2
CROSS APPLY   dbo.SplitStrings_Numbers(s2.value, '#') s3;

SELECT * FROM dbo.SplitStrings_Native(@s, ',') s
CROSS APPLY   dbo.SplitStrings_Native(s.value, ';') s1 
CROSS APPLY   dbo.SplitStrings_Native(s1.value, '!') s2
CROSS APPLY   dbo.SplitStrings_Native(s2.value, '#') s3;

The STRING_SPLIT approach certainly comes up with *better* estimates than CLR, but it still grossly over-estimates (in this case, when the string is empty; this might not always be the case). The function has a built-in default that estimates the incoming string will have 50 elements, so when you nest them you get 50 x 50 (2,500); if you nest them again, 50 x 2,500 (125,000); and then finally, 50 x 125,000 (6,250,000):

Estimated plan for nested STRING_SPLIT()
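The compounding of that fixed 50-row guess is easy to reproduce; a small Python sketch, for illustration:

```python
# Each nested split is assumed by the optimizer to yield 50 rows,
# so the estimate compounds as 50^n across n nesting levels.
def nested_split_estimate(levels, per_split=50):
    estimate = 1
    for _ in range(levels):
        estimate *= per_split
    return estimate

print([nested_split_estimate(n) for n in range(1, 5)])
# [50, 2500, 125000, 6250000]
```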

Note: OPENJSON() behaves the exact same way as STRING_SPLIT – it, too, assumes 50 rows will come out of any given split operation. I am thinking that it might be useful to have a way to hint cardinality for functions like this, in addition to trace flags like 4137 (pre-2014), 9471 & 9472 (2014+), and of course 9481…

This 6.25 million row estimate is not great, but it is much better than the CLR approach that Dan was talking about, which estimates A TRILLION ROWS, and I lost count of the commas to determine data size – 16 petabytes? exabytes?

Estimated plan for nested CLR split

Some of the other approaches obviously fare better in terms of estimates. The Numbers table, for example, estimated a much more reasonable 438 rows (in SQL Server 2016 RC2). Where does this number come from? Well, there are 8,000 rows in the table, and if you remember, the function has both an equality and an inequality predicate:

     WHERE Number <= LEN(@List)
     AND SUBSTRING(@Delimiter + @List, [Number], 1) = @Delimiter

So, SQL Server multiplies the number of rows in the table by 10% (as a guess) for the equality filter, then the square root of 30% (again, a guess) for the inequality filter. The square root is due to exponential backoff, which Paul White explains here. This gives us:

8000 * 0.1 * SQRT(0.3) = 438.178
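That arithmetic is easy to verify, and the exponential backoff pattern generalizes: the most selective predicate counts in full, the next at its square root, the next at its fourth root, and so on. A quick Python sketch (illustrative; the generalized form beyond the two predicates shown here is an assumption, and the 10%/30% guesses are the ones described above):

```python
import math

def backoff_estimate(table_rows, *selectivities):
    # Exponential backoff: sort predicates most selective first, then apply
    # s1 * s2^(1/2) * s3^(1/4) * ... before scaling by the table cardinality.
    combined = 1.0
    for i, s in enumerate(sorted(selectivities)):
        combined *= s ** (0.5 ** i)
    return table_rows * combined

print(round(backoff_estimate(8000, 0.1, 0.3), 3))  # 438.178
```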

The XML variation estimated a little over a billion rows (due to a table spool estimated to be executed 5.8 million times), but its plan was far too complex to try to illustrate here. In any case, remember that estimates clearly don't tell the whole story - just because a query has more accurate estimates does not mean it will perform better.

There were a few other ways I could tweak the estimates a bit: namely, forcing the old cardinality estimation model (which affected both the XML and Numbers table variations), and using TFs 9471 and 9472 (which affected only the Numbers table variation, since they both control cardinality around multiple predicates). Here were the ways I could change the estimates just a little bit (or A LOT, in the case of reverting to the old CE model):

Playing with estimates using trace flags

The old CE model brought the XML estimates down by an order of magnitude, but for the Numbers table, completely blew it up. The predicate flags altered the estimates for the Numbers table, but those changes are much less interesting.

None of these trace flags had any effect on the estimates for the CLR, JSON, or STRING_SPLIT variations.

Conclusion

So what did I learn here? A whole bunch, actually:

  • Parallelism can help in some cases, but when it doesn't help, it really doesn't help. The JSON methods were ~5x faster without parallelism, and STRING_SPLIT was nearly 10x faster.
  • The spool actually helped the CLR approach perform better in this case, but TF 8690 might be useful to experiment with in other cases where you're seeing spools and are trying to improve performance. I am certain there are situations where eliminating the spool will end up being better overall.
  • Eliminating the spool really hurt the XML approach (but only drastically so when it was forced to be single-threaded).
  • Lots of funky things can happen with estimates depending on the approach, along with the usual statistics, distribution, and trace flags. Well, I suppose I already knew that, but there are definitely a couple of good, tangible examples here.

Thank you to the folks who asked questions or prodded me to include more information. And as you might have guessed from the title, I address yet another question in a second follow-up, this one about TVPs:

The post STRING_SPLIT() in SQL Server 2016 : Follow-Up #1 appeared first on SQLPerformance.com.

]]>
https://sqlperformance.com/2016/04/sql-server-2016/string-split-follow-up-1/feed 2