Sunday, February 19, 2012

excel source with optional columns

Hi:

I use a SSIS package to loop thro a folder and load data from multiple excel files to a SQL2005 table. Works fine except when an excel has a missing col.

Col names in xls are always a subset of col names in the table. The missing cols are random, else I would just have made another package:-)

Once a missing column is found, I get runtime and design time errors, and metadata problems. How can a get SSIS to ignore missing columns?

TIA

I recently solved this problem using a dynamically built select statement. Is it always just 1 column that's missing or do you need to load a dynamic number of columns? If it's a truly dynamic then the algorithm is a little more complex...|||

Thanks for your response. Request you tell me more aboout it.

I did the whole thing in BIDS in a SSIS project, using a ForEach container, a Excel Source and an OleDB destination. I was hoping to achieve my objectives with these objects and their settings :-).

|||I used a For Each Loop and then a For Loop to solve this problem.

The first For Each Loop iterates threw the columns names in the spreadsheet. It contains a script component that counts the columns storing the result in a variable. There might be a more efficient way to count columns but I couldn't figure out how.

The second For Loop container uses this counter variable to select and load each column one at a time. It contains 2 components; a script component that builds a select statement and a data flow task that actually moves the data using the select statement.

Here is the script code that dynamically builds each select statement:

Public Sub Main()
Dim SelectCommand As String
Dim WorksheetName As String
Dim ColumnLoopIndex As Integer
WorksheetName = Dts.Variables("WorksheetName").Value.ToString
ColumnLoopIndex = CInt(Dts.Variables("ColumnLoopIndex").Value)
SelectCommand = "Select F" & ColumnLoopIndex.ToString & " AS CurrentColumn from [" & WorksheetName & "]"
Dts.Variables("SelectCommand").Value = SelectCommand
Dts.TaskResult = Dts.Results.Success
End Sub

Please note: Depending on your data and how dynamic you want the

package to be you could skip the second For Loop and build a single select

statement that loads all of the columns. In this case your dynamically built select

statement would contain return fields like "SELECT F1, F2, NULL AS F3, NULL AS F4

FROM [myworksheetname]" to account for missing F3 and F4 columns.

No comments:

Post a Comment