Getting Started - What's New - Version 5.0
New Features In This Release
Improved Query Optimization And Other Performance Improvements
We have done a deep dive into the way VistaDB processes queries and implemented a new approach to query optimization. The new optimization emphasizes:
- Improving the performance of common queries (even if they're already fast)
- Addressing situations where straightforward queries had unexpectedly bad performance.
Common simple queries are frequently 25% faster than VistaDB 4.3 with other queries improving up to 75%. In most cases, no changes to client applications are necessary to take advantage of the performance improvements.
Multi-Column Indexes
Previously, VistaDB used multi-column indexes only for uniqueness constraints, it couldn't employ them as part of a filter or join operation. Now VistaDB will examine the WHERE clause and JOIN conditions and attempt to find a matching index which covers all of the necessary columns in the table.
ORDER BY Improvements
Queries which specify an ORDER BY that can be supported by an available index and do not require other filtering can now take advantage of that index to return results in that order as a "live" query rather than having to spool all results and sort them before returning any results (as a non-live query). This allows such queries, when limited by TOP (or OFFSET/FETCH) to, process fewer rows (instead of all of them) and complete efficiently even for very large tables. When other filtering (e.g. column comparisons in WHERE or ON clauses) is specified such filtering will typically take priority when selecting the "active" index order and require spooling for a final sort if the ORDER BY is not consistent with the same "active" index.
This improvement primarily affects queries such as SELECT TOP 100 * FROM [BigTable] ORDER BY [IndexedColumn]; which have previously completed in time scaling linearly with table size (getting much slower for very large tables). Applicable queries should now perform in time scaling approximately linearly with the number of TOP rows (or OFFSET + FETCH rows) and logarithmically (or so) with table size (due to depth of index) making it now practical to perform such queries on tables with millions of rows. Only the forward order of an index is currently supported for a “live” ORDER BY query; the reverse ordering will still require a non-live query (spool and sort) as before.
Composite Memory Index
The binary in-memory index form VistaDB has historically used for filtering data when there are multiple constraints or multiple columns. This comes in particularly handy in the cases where there are multiple logical clauses being OR'd together that don't optimize out at runtime.
In the case where the Composite Memory Index is being used additional optimizations have been created specifically for sparse results (where most of the table rows are not being included) that can skip to the next block with the first row included instead of checking each item against the bitmap. This can dramatically improve the performance for large tables.
The Composite Memory Index has been integrated with the new ORDER BY capability to prevent it from spooling results into memory first and then sorting them for the return value when that would take significant memory.
Multiple Temporary Indexes per Table
VistaDB has been able to generate a temporary index in cases where there is a filter clause on a column and no index. This process has been optimized by allowing it to make multiple single-column indexes on the same table (useful for the OR cases associated with Composite Memory Indexes) and allowed it to keep these indexes through an entire expression tree which allows a subquery and parent query to reuse the same temporary indexes.
Improved Sub-query Performance
Compared to prior versions of VistaDB subqueries can now reuse a significant amount of their work between top-level rows compared to prior versions of VistaDB. Depending on the type of work done in the subqueries this can have a dramatic effect on overall query performance.
Reduced Engine Overhead
The amount of memory used by VistaDB has been optimized to let it both take advantage of higher memory systems and scale back to smaller systems even with large tables or very large columns. The impact on the .NET Garbage Collector has been notably reduced for wide tables (with many columns) to reduce the potential for application impact and improve scaling across multiple cores.
CLR Procedure Performance
Internal changes to how CLR Procedures are invoked reduce the call overhead by 80%. For simple CLR Procedures this can translate directly to a similar improvement in overall performance.
New SQL Features Supported
For VistaDB 5 several new TSQL language features are now supported:
- SELECT now supports ANSI standard OFFSET/FETCH paging syntax (as in SQL Server 2012) as part of the ORDER BY clause.
- The various DateTime functions now support part specifiers for MICROSECOND (MCS) and NANOSECOND (NS).
- The CONVERT function now supports an integer-valued expression as the format argument, not just a literal integer.
New Data Types have also been added:
- DATETIMEOFFSET, DATETIME2, DATE, TIME Data Types: These are all now supported by VistaDB.
- DOUBLE PRECISION is now parsed and accepted (as another name for FLOAT).
- FLOAT (and DOUBLE PRECISION) now parse and accept an optional bit-length specifier up to 53 (using 24 or less declares as REAL instead) as in SQL Server.
- BINARY(n), VARBINARY(n): Both type declarations previously mapped to the same VARBINARY type and ignored any specified length—other than MAX—for an implicit limit of 64k which disqualified it from inclusion in an index. The new separate BINARY(n) and VARBINARY(n) types can specify lengths up to the standard limit of 8000 and can be used in an index. Migration of older VistaDB databases containing BINARY/VARBINARY columns will map them to VARBINARY(MAX) in VistaDB 5 to preserve data.
Improved Compatibility with T-SQL Scripting
- Semicolons (;) Optional: The SQL Parser now supports dropping the semicolon at the end of statements like SQL Server in nearly all cases, making it much easier to port code from SQL Server where this requirement was never enforced.
- SET Options: The parser now accepts ANSI standard options for already-supported behavior: SET NOCOUNT ON, SET ANSI_NULLS ON, SET QUOTED_IDENTIFIER ON. These do not accept OFF because VistaDB only ever supported the standard ON behavior, not the pre-ANSI behavior.
- CREATE FUNCTION AS: AS is now optional in CREATE FUNCTION between parameter declaration and BEGIN.
- CREATE PROCEDURE: Parenthesis are now allowed (but optional) around the parameter list. Previously they could be supplied but would fail to execute.
- Improved CONVERT: Formats used by the Convert function have been made more fully consistent with standard.
- Improved Variable Scoping: Variables declared in a query now follow Visual Basic-style scoping (added to the entire query context), as they are in SQL Server, rather than C#-style scoping (within the encapsulating BEGIN…END block).
Proper Implementation of Rand()
Previously, Rand() returned a new pseudo-random value for each row and each reference within a SELECT. Rand(int) would return a single fixed value based on the next pseudo-random value after the given seed but would not affect other usage of Rand(), so Rand could not produce repeatable pseudo-random data such as for testing and debugging purposes. SQL Server does not vary the result by row within a SELECT but does produce separate values for each reference within the SELECT list (and WHERE clause, etc). Use of Rand(int) in SQL Server would set a new pseudo-random seed for all usage of Rand() by the current connection and would return the first pseudo-random value after the new seed is set. With this correction, each reference to Rand() will return the next pseudo-random value but will then hold that value over all rows of the SELECT, and use of Rand(int) will correctly re-seed the current connection. Note that exact random sequence from a given seed value may not match SQL Server.
Visual Studio & Entity Framework Support
- Entity Framework 6 Support: A new provider for VistaDB has been written that supports the updated EF 6 provider model. It is available for .NET 4.0 and higher.
- Visual Studio 2013 Designer Support: The existing VistaDB designer support for Visual Studio 2010 and 2012 has been extended into 2013.
New FIPS-Compliant Encryption Support
The previous database encryption which used a managed implementation of Blowfish has been replaced with the FIPS-compliant AES implementation built into the .NET runtime. When an encrypted database is upgraded it will be converted from Blowfish to AES. For .NET 4.0 and later the FIPS-compliant unmanaged implementation built into the .NET runtime (which leverages the Microsoft Crypto API underneath) is used. For .NET 2.0 the managed implementation is used since the FIPS-complaint implementation isn't available.
The new encryption support is now database-wide instead of requiring individual columns to enable encryption. This protects the entire database including schema. This mode is also faster than the previous implementation.
New Diagnostic Logging Support
The VistaDB provider now exposes a Logging facility that lets you capture information about each SQL batch and other important operations. You can implement your own logger or use the Loupe Agent for VistaDB for a great experience with no coding required.
Breaking Changes In This Release
Maintaining compatibility with SQL Server and with previous releases of VistaDB is a high priority for the development team. Like SQL Server, each major release is an opportunity to address longstanding issues even when they can't be done in a backwards compatible fashion. The following items are changes to how VistaDB works that may affect existing code being brought forward:
- Bit comparisons: SQL Server does not allow non-compared references to columns (even to BIT columns) as a boolean condition in a WHERE or ON clause; they must be compared to 0 or 1 to create a boolean condition VistaDB now enforces this rule which means it's necessary to explicitly add an = 1 to the SQL Query to make a valid query.
- Binary/VarBinary/VarBinary(Max) now match SQL Server behavior: previously they ignored the length you provided and were always 64kb in size (and couldn’t be used in an index). Now they require and respect the length provided and *can* be used in an index. Existing Binary & VarBinary columns are changed to the equivalent (Max) column since no length specification was available.
- File Format Change: To support new data types and encryption the file format was modified. This means databases upgrading from 4.x to 5.x will need to be packed to upgrade. That will rewrite the file in 5.0 format. The conversion is one way. The same was historically true for someone coming from 3.x to 4.x; so now users can go from 3.x or 4.x to 5.0 by packing the source database.
- File Name Change: All of the VistaDB assemblies have been renamed to allow them to be side-by-side with older versions and to provide a .NET 4 (and later) version.
- Sp_Rename now requires SQL Server calling syntax: Calling sp_Rename with () around the arguments is no longer supported, matching TSQL.
- IDENTITY columns are now left out of implied lists of columns for INSERTs: matching SQL Server behavior, when selecting * from a table as part of a VALUES section of an INSERT identity columns are ignored since they can't be inserted.
- Column Default values now match SQL Server syntax: Previously extra string quoting ('') was required for default values and they didn't support expressions. Now the values should be quoted using the same rule as Sql server, using single quotes for constants instead of double quotes. While expressions will be parsed for defaults they are not yet supported by the engine.
Visual Studio & Entity Framework Support
- Visual Studio 2008 Integration Not Supported: VistaDB no longer integrates with Visual Studio 2008. This means the DataSet designer and other operations performed in the Visual Studio UI are no longer supported.
- Entity Framework 1.0 Not Supported: VistaDB 5 only supports Entity Framework in .NET 4.0 and higher. EF 1.0 (which only shipped under .NET 3.5) is no longer supported. Customers are strongly encouraged to update their Entity Framework Projects to EF 5 or EF 6.
Defects Fixed In This Release
Engine
- Query Fix: AND and OR operators can sometimes produce incorrectly-constant results in execution within a WHILE loop when one of their operands is a constant NULL (UNKNOWN) value.
- Query Fix: HAVING clause results in a null-reference exception if evaluation produces a NULL/UNKNOWN final result.
- Query Fix: May fail to apply WHERE clause for some joins of subqueries, particularly with CROSS JOIN or with a trivially-true ON clause.
- Query Fix: Some queries with ORDER BY TableName.ColumnName can fail to sort when used with GROUP BY.
- Query Fix: Rows containing NULL values for any later column of a multi-column index can be incorrectly included or excluded in a SELECT when the WHERE clause (or an ON clause) includes a comparison on the first column of that multi-column index.
- Query Fix: Use of IS NOT NULL in a WHERE clause (or ON clause) can be handled incorrectly when combined with other comparisons and then inverted (NOT).
- Query Fix: SELECT with no FROM and with a non-trivial WHERE clause (for example, parameter-dependent) incorrectly returns non-empty results when WHERE clause is not true.
- Query Fix: OUTER JOINs with WHERE Indexed Column IS NULL may return incorrect results.
- Query Fix: Implementation of Rand() is inconsistent with SQL Server.
- Query Fix: BETWEEN operator evaluates as false instead of null when any input is NULL, so NOT BETWEEN of same inputs incorrectly passes the condition.
- Update Fix: UPDATE statement can fail with an index-out-of-bounds exception if FROM rowset is given but does not include the target table.
- Stored Procedure Fix: EXEC procedure calls in queries can produce errors when no semicolon separates it from a following statement.
- Stored Procedure Fix: EXEC stored procedure calls can incorrectly take values from variables of the same name in the calling scope instead of getting a declared default (or NULL) when an argument is omitted.
- User-Defined Function Fix: Scalar value User-defined functions (UDF) with a table variable fail at runtime.
- Parser Fix: Execution of STORED PROCEDURE with parenthesis around declared parameters can fail due to parsing error.
- Parser Fix: SELECT incorrectly attempts to parse column names that match function names as a function call instead of first treating them as a column name.
- Parser Fix: Incorrectly parses select list entries of expressions which start with a parameter (e.g. @param1 + val).
- Parser Fix: Does not correctly parse certain 3-part column names like: [dbo].[MyTable].ColumnName
- Parser Fix: SELECT improperly accepts other unquoted keywords as table or column alias and thus can fail to recognize the start of a new statement when an optional alias can follow if no semicolon is used to separate them.
- Parser Fix: INSERT INTO … SELECT can fail to parse the next statement after the SELECT query when there are no parentheses around the SELECT query and no semicolon separating it from a following statement.
- Parser Fix: Unary minus after multiplicative operators (*, /, %) results in an error (e.g. 5 * -3).
- Parser Fix: A bare RETURN in a stored procedure or function with no semicolon can fail to parse.
- Performance Fix: Performance of JOINs on multi-column keys can be slower in 4.3 than in 4.2.
- Schema Fix: Using the same table after adding a foreign key can sometimes fail depending on memory utilization and database schema size.
Data Builder
- Fixed: An error parsing a stored procedure or user-defined function can cause the database to fail to open. Now these are indicated with a warning graphic in the tree view.
Data Migration Wizard
- Fixed: Setting an encryption password in wizard when migrating from non-VistaDB databases leaves database data unencrypted.
- Fixed: Setting and then erasing the encryption password field in wizard form can cause migration to fail in creating a new database.
- Fixed: AutoIncrement columns in Access databases can fail to migrate due to an unexpected syntax error.
Visual Studio Designer
- Fixed: Entity Framework Designer is unacceptably slow (taking many minutes) for basic schemas under EF 5 and EF 6.
- Fixed: Entity Framework Designer fails if any table uses a column named the same as a SQL function.
Known Issues
Compared to VistaDB 4, there is the following issue still open:
- Full Text Search indexes can't be optimized with other indexes: The new optimization system doesn't consider FTS indexes when optimizing the query so FTS queries will not scale well for tables of size.