SQL Server Integration Services (SSIS) utilize several type systems – some depending on the different data providers supported by the SSIS, some depending on the environment where the service is used, etc. When designing SSIS packages, especially in heterogeneous environments, where different data management platforms and tools are used, appropriate data type mapping is highly critical – just think of quantities, precision and scale. This article should provide you with some of the basics of how data types of the several type systems supported by SSIS can be used when integrating data from diverse source and type systems. The article also mentions some of the type-related shortcomings of SSIS and tries to help you correct work around them.
The SQL Server Integration Services Data Type Map
The following table provides a mapping of all data types used in SQL Server Integration Services; the mapping is represented from the SQL Server perspective. Data types that are, in fact, supported by other type systems represented in the table, but which do not map to a corresponding SQL Server data type, are not shown here.
Obviously, when I say all, I actually mean all, except the type system used by the SQL Server Native Client ODBC provider; I plan to cover that in a later post.
Also note that, rather than providing a data type compatibility matrix (which, to be honest, could have been even more useful), I've instead tried to provide a simplified, one-to-one data type compatibility map, that you can use to safely cast a value using a data type of one type system to the nearest data type in another type system.
Data Type | |||||
---|---|---|---|---|---|
SQL Server | SSIS Variables |
SSIS Pipeline Buffer |
OLE DB | ADO.NET | |
bigint | Int64 | DT_I8 | LARGE_INTEGER | Int64 | |
binary | Object* | DT_BYTES | n/a | Binary | |
bit | Boolean | DT_BOOL | VARIANT_BOOL | Boolean | |
char | String | DT_STR | VARCHAR | StringFixedLength | |
date | Object* | DT_DBDATE | DBDATE | Date | |
datetime | DateTime | DT_DBTIMESTAMP | DATE** | DateTime | |
datetime2 | Object* | DT_DBTIMESTAMP2 | DBTIME2 | DateTime2 | |
datetimeoffset | Object* | DT_DBTIMESTAMPOFFSET | DBTIMESTAMPOFFSET | DateTimeOffset | |
decimal | Object*** (< SQL 2012) Decimal (>= SQL 2012) |
DT_NUMERIC | NUMERIC | Decimal | |
float | Double | DT_R8 | FLOAT | Double | |
image | Object* | DT_IMAGE | n/a | Binary | |
int | Int32 | DT_I4 | LONG | Int32 | |
money | Object* | DT_CY (OLE DB) DT_NUMERIC (ADO.NET) |
CURRENCY | Currency | |
nchar | String | DT_WSTR | NVARCHAR | StringFixedLength | |
ntext | String | DT_NTEXT | n/a | String | |
numeric | Object*** (< SQL 2012) Decimal (>= SQL 2012) |
DT_NUMERIC | NUMERIC | Decimal | |
nvarchar | String | DT_WSTR | NVARCHAR | String | |
nvarchar(max) | Object | DT_NTEXT | n/a | n/a | |
real | Single | DT_R4 | FLOAT, DOUBLE | Single | |
rowversion | Object* | DT_BYTES | n/a | Binary | |
smalldatetime | DateTime | DT_DBTIMESTAMP | DATE** | DateTime | |
smallint | Int16 | DT_I2 | SHORT | Int16 | |
smallmoney | Object* | DT_CY (OLE DB) DT_NUMERIC (ADO.NET) |
CURRENCY | Currency | |
sql_variant | Object* | DT_WSTR**** (OLE DB) DT_NTEXT**** (ADO.NET) |
**** | Object**** | |
table | Object* | n/a | ***** | ||
text | Object* | DT_TEXT | n/a | n/a | |
time | Object* | DT_DBTIME2 | DBTIME2 | Time | |
timestamp | Object* | DT_BYTES | n/a | Binary | |
tinyint | Byte | DT_UI1 | BYTE | Byte | |
uniqueidentifier | String****** (OLE DB) Object****** (ADO.NET) |
DT_GUID | GUID | Guid | |
varbinary | Object* | DT_BYTES | n/a | Binary | |
varbinary(max) | Object* | DT_IMAGE | n/a | Binary | |
varchar | String | DT_STR | VARCHAR | String | |
varchar(max) | Object* | DT_TEXT | n/a | n/a | |
xml | Object* | DT_NTEXT | ***** |
Type Systems
The following type systems are supported by SSIS:
- SQL Server Data Types – these are the essential data types supported by SQL Server; also known as built-in or system data types. Custom, CLR-based data types, are not discussed in this article, mostly because they are specific to SQL Server, specific to a particular version of SQL Server, or could be represented using system data types (e.g. by character or binary data types, or by XML);
- SSIS Variable Data Types – the data types of the underlying type system used by the SSIS service are exposed to the programming environment as .NET Framework data types. However, be aware that not all .NET Framework data types can be used by the SSIS variables. You can find a list of supported SSIS variable data types in the TypeCode Enumeration article on MSDN;
- SSIS Pipeline Buffer Data Types – the data types used by the pipeline buffer, the essential element of the Data Flow task, are exposed using a dedicated type system. This type system is different from the one used by the SSIS variables. As long as the Data Flow definition metadata corresponds to the metadata of the data sources involved in the Data Flow, you should not experience any problems with this particular type system, regardless of the data providers used by the Data Source or the Data Destination components. For additional information on SSIS pipeline buffer data types consult the Integration Services Data Types article on MSDN;
- OLE DB and ADO.NET Data Types – the data providers use their own type systems, which are exposed to the SSIS programming interface. These are also different from the rest of the type systems. When referring to these two data providers in this article, I specifically refer to their use in the Execute SQL Task. You can find more information on OLE DB data types and data type conversions on MSDN, beginning with Data Types in OLE DB (OLE DB). For more information about ADO.NET data types start with Data Type Mappings in ADO.NET.
The Base of All Bases
In the .NET Framework all data types are derived from Object
. Therefore, all .NET data types are convertible to Object
. In other words: wherever .NET data types are used and no other more appropriate data type is available, Object
can be used instead.
SSIS variables utilize a subset of .NET data types, including Object
; if the most appropriate data type is not available for your SSIS variable, use Object
.
Why not use Object
for just any SSIS variable? Think of debugging and logging. While most other data types implement at least one standard accessor that allows you to view or display the actual value, or to write it to a log, with Object
you would have to add your own programmatic logic to convert the values to their actual type before they could be displayed or written to the log.
Exceptional Cases
DATE
Even though OLE DB documentation – for instance, the Data Type Support for OLE DB Date/Time Improvements article on MSDN (also available for SQL Server versions 2008 and 2008 R2) – suggests that the OLE DB DBTIMESTAMP
be used for SQL Server DATETIME
or SMALLDATETIME
values, this data type does not seem to be supported in the Execute SQL Task. Attempts to pass SQL Server DATETIME
or SMALLDATETIME
values to (or from) an OLE DB DBTIMESTAMP
parameter will result in the following error:
Executing the query "..." failed with the following error: "Invalid time format". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Alternatively, you can use the OLE DB DATE
data type for DATETIME
or SMALLDATETIME
values. Be warned, though, that the precision of DATE
is one second, which means that DATETIME
values might be truncated. On the other hand, SMALLDATETIME
values, whose precision is one minute, will not be affected.
ADO.NET is a more appropriate option in such cases, as it does not suffer from these limitations.
DECIMAL
Prior to SSIS 2012, DECIMAL
was not a supported SSIS variable data type. Various workarounds have so far been proposed by other users, but to me the only sensible option for SSIS variables holding decimal values is the use of the Object
data type; the values will be implicitly converted to the appropriate destination data type – for instance, when passed to an Execute SQL Task parameter. Starting with SQL Server 2012, DECIMAL
is (finally) available for use with SSIS variables; so, at least this particular workaround is now a thing of the past.
The Decimal
data type is available in OLE DB as well as ADO.NET data providers; however, only ADO.NET actually supports it in SSIS. The following exception is raised by SSIS when trying to use the DECIMAL
data type in an Execute SQL Task using OLE DB (regardless of the fact that DECIMAL
is listed as a supported data type in the Execute SQL Task editor in SSDT or BIDS):
The type is not supported.DBTYPE_DECIMAL
The OLE DB NUMERICAL
data type is available and compatible; its precision and scale match the ones used for SQL Server DECIMAL
(and NUMERIC
) data types.
UNIQUEIDENTIFIER
Even though Guid
is a native .NET data type, it is not part of the SSIS variable type system. You can create UNIQUEIDENTIFIER
or Guid
values using SSIS (e.g. using the Script Task or the Execute SQL Task), but in order to pass them to other SSIS components, you must resort to a "trick". For instance, to create a Guid
value in a Script Task and then assign it to a SSIS variable of type String
, use the following assignment:
Dts.Variables["my_guid_string_variable"].Value = "{" + Guid.NewGuid().ToString() + "}";
A Guid
value cast to String
and formatted this way can then be passed as a parameter to an Execute SQL Task, or to the SSIS Pipeline Buffer (e.g. in a Derived Column data flow component) – mapped to a Guid
parameter, or a DT_GUID
column. It will be implicitly cast to the appropriate type.
When UNIQUEIDENTIFIER
values are returned from the Execute SQL Task, their type depends on the data provider used by the task. Attempts to assign a Guid
value returned from ADO.NET to a String
SSIS variable, will result in the usual type mismatch error:
The type of the value being assigned to variable "User::string_guid_variable" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
Again, you can prevent this by using Object
as the SSIS variable data type.
SQL_VARIANT
It would be wrong to say that the SQL_VARIANT
data type is not supported by the Execute SQL Task. On the other hand, you should be very careful with this; any OLE DB type, which is compatible with a SQL Server data type, which is in turn compatible with the SQL_VARIANT
data type, can be used in the Execute SQL Task. However, having to know the type in advance contradicts the principal purpose of SQL_VARIANT
– its ability to support a variety of data types in one variable, or in the same column. I would advise against using the OLE DB provider with the Execute SQL Task if you have to deal with SQL_VARIANT
.
There is no equivalent data type for SQL_VARIANT
in the type system used by the SSIS Pipeline Buffer. If you're using an OLE DB data source and destination, the Data Flow designer will automatically use the DT_WSTR
type for SQL_VARIANT
columns with the following warning:
The output "OLE DB Source Output" references an external data type that cannot be mapped to a Data Flow task data type. The Data Flow task data type DT_WSTR will be used instead.
Also note that the size of the column in the data flow metadata will be set by the editor based on the sample of rows provided at design time. At run time, you might encounter truncation errors, should the size of the actual data exceed the one set at design time.
If you're using an ADO.NET source, the Data Flow designer will automatically use the DT_NTEXT
data type for SQL_VARIANT
columns with the following warning:
The data type "System.Object" found on column "sql_variant_column" is not supported for the component "ADO NET Source" (201). This column will be converted to DT_NTEXT.
Personally, I would advise against using SQL_VARIANT
columns in data flows at all, unless your own tests conclusively show that the data in your environment is read from the source and written to the destination correctly.
TABLE and XML
Even though the SQL Server TABLE
type is supported by the .NET Framework, OLE DB, and ADO.NET – that is, via Table-valued Parameters – it is not supported as a data type in SSIS. Typically, in SSIS, in-flight set-oriented processing is performed using the Data Flow task; of course, if you can stage your data, and are not required to perform all data processing in a single data flow, you can move some of that set-oriented logic outside the SSIS process (for instance, updates or merges can be performed using the UPDATE
and/or MERGE
statements, executed from SSIS by using the Execute SQL Task).
XML
is a complex native data type in SQL Server, and it can also be represented by a string. Unfortunately, in SSIS, things are not quite as simple as that. The .NET Framework implements several types based on the W3C XML Recommendation, ADO.NET also supports XML natively, and OLE DB supports the use of XML data – to some extent. I believe XML deserves special attention in SSIS, and I've covered it in more detail in my earlier post on SSIS and complex parameters.
Large Object Data
Large-object data types, such as VARCHAR(MAX)
, NVARCHAR(MAX)
OR VARBINARY(MAX)
, are not fully supported by the Execute SQL Task, by neither the OLE DB, nor the ADO.NET, providers. By using the ADO.NET provider it is possible to pass IMAGE
, NTEXT
or VARBINARY(MAX)
data to and from the Execute SQL Task, but neither VARCHAR(MAX)
, nor NVARCHAR(MAX)
, parameters are supported. It is, however, possible to work around these limitations by using more elaborate techniques, which I feel deserve a dedicated article, so keep watching this blog for more information on that particular subject.
Additional Notes
I have been meaning to write this article for years; mainly because I needed it (and still do), but also because I couldn't find a single online resource that would cover, in one place, all of the various type systems used in SSIS.
In the beginning, simply locating appropriate documentation quickly proved to be a daunting task; I've also tried to utilize online resources as much as possible – because, after all, this article too will be available online. I also wanted to test each combination, as life with IT has taught me not to take any statement, even if it's part of vendor documentation, for granted. Therefore, I created an SSIS solution in versions 2008 and 2012, which I have used to verify the compatibility of the different type systems. I've tried to be as careful as possible in my experimentation, making sure that I detected as many problems as possible before making any conclusions; however, I still feel there could be things that I might have overlooked.
Therefore, dear reader, if you happen to find any flaws in this (and any other of my texts dealing with data types), please, let me know.
ML