Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Yes, one possible option is to make use of Script Task to process only the Worksheets of your preference.</p> <p>Following example was created using <em>SSIS 2008 R2</em> and <em>Excel 2010</em>. The working folder for this example is <code>C:\Temp\</code>. I think the logic should still hold good for previous versions.</p> <p>In the folder path <code>C:\Temp\</code>, create an Excel 2007 spreadsheet file named <code>Country_States.xlsx</code> with three worksheets namely <code>US_1</code>, <code>US_2</code> and <code>Canada_1</code>.</p> <p><code>US_1</code> of <em>Country_States.xlsx</em> contained the following data</p> <p><img src="https://i.stack.imgur.com/r6wih.png" alt="States_1_Sheet_1"></p> <p><code>US_2</code> of <em>Country_States.xlsx</em> contained the following data</p> <p><img src="https://i.stack.imgur.com/zAJ4Z.png" alt="States_1_Sheet_2"></p> <p><code>Canada_1</code> of <em>Country_States.xlsx</em> contained the following data</p> <p><img src="https://i.stack.imgur.com/AvXVO.png" alt="[States_2_Sheet_1"></p> <p>Create a table in SQL Server named dbo.Destination using the below create script. Excel sheet data will be inserted into this table.</p> <pre><code>CREATE TABLE [dbo].[Destination]( [Id] [int] IDENTITY(1,1) NOT NULL, [State] [nvarchar](255) NULL, [Country] [nvarchar](255) NULL, [FilePath] [nvarchar](255) NULL, [SheetName] [nvarchar](255) NULL, CONSTRAINT [PK_Destination] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY] GO </code></pre> <p>The table is currently empty.</p> <p><img src="https://i.stack.imgur.com/aXu6B.png" alt="Empty table"></p> <p>Create a new SSIS package and on the package, create the following 6 variables. <strong>FolderPath</strong> will contain the folder where the Excel files are stored. <strong>FilePattern</strong> will contain the extension of the files that will be looped through and this example works only for <code>.xlsx</code>. <strong>FilePath</strong> will be assigned with a value by the Foreach Loop container but we need a valid path to begin with for design time and it is currently populated with the path <code>C:\temp\Country_States.xlsx</code> of the first Excel file. <strong>SheetName</strong> will contain the actual sheet name but we need to populate with initial value <code>US_1$</code> to avoid design time error. <code>ProcessTheSheet</code> will contain true/false and set with the default value of false. <code>PatternToMatch</code> will contain the pattern that will use to match only the worksheets of our choice.</p> <p><img src="https://i.stack.imgur.com/JOJlJ.png" alt="Variables"></p> <p>In the package's connection manager, create an ADO.NET connection with the following configuration and name it as <strong>ExcelSchema</strong>.</p> <p>Select the provider <code>Microsoft Office 12.0 Access Database Engine OLE DB Provider</code> under .Net Providers for OleDb. Provide the file path <code>C:\temp\Country_States.xlsx</code></p> <p><img src="https://i.stack.imgur.com/D8OGb.png" alt="ExcelSchema 1"></p> <p>Click on the <code>All</code> section on the left side and set the property Extended Properties to <code>Excel 12.0</code> to denote the version of Excel. Here in this case 12.0 denotes <code>Excel 2007 or above</code>. Click on the Test Connection to make sure that the connection succeeds.</p> <p><img src="https://i.stack.imgur.com/q69xl.png" alt="ExcelSchema 2"></p> <p>Create an Excel connection manager named Excel as shown below.</p> <p><img src="https://i.stack.imgur.com/stS6t.png" alt="Excel"></p> <p>Create an OLE DB Connection SQL Server named <code>SQLServer</code>. So, we should have three connections on the package as shown below.</p> <p><img src="https://i.stack.imgur.com/Y9im1.png" alt="Connections"></p> <p>We need to do the following connection string changes so that the Excel file is dynamically changed as the files are looped through.</p> <p>On the connection <strong>ExcelSchema</strong>, configure the expression <code>ServerName</code> to use the variable <code>FilePath</code>. Click on the ellipsis button to configure the expression.</p> <p><img src="https://i.stack.imgur.com/pmj9E.png" alt="ExcelSchema ServerName"></p> <p>Similarly on the connection <strong>Excel</strong>, configure the expression <code>ServerName</code> to use the variable <code>FilePath</code>. Click on the ellipsis button to configure the expression. </p> <p><img src="https://i.stack.imgur.com/oJIft.png" alt="Excel ServerName"></p> <p>On the Control Flow, place two Foreach Loop containers one within the other. The first <code>Foreach Loop container</code> named Loop files will loop through the files. The second <code>Foreach Loop container</code> will through the sheets within the container. Within the inner For each loop container, place a Script Task that will validate only the sheets that should be processed and a Data Flow Task that will read the Excel files and load data into SQL</p> <p><img src="https://i.stack.imgur.com/5dGX1.png" alt="Control Flow"></p> <p>Configure the first Foreach loop container named <strong>Loop files</strong> as shown below:</p> <p><img src="https://i.stack.imgur.com/cW7F7.png" alt="Foreach Loop 1 Collection"></p> <p><img src="https://i.stack.imgur.com/rlyUE.png" alt="Foreach Loop 1 Variable Mappings"></p> <p>Configure the first Foreach loop container named <strong>Loop sheets</strong> as shown below:</p> <p><img src="https://i.stack.imgur.com/bVnbR.png" alt="Foreach Loop 2 Collection"></p> <p><img src="https://i.stack.imgur.com/RZDbK.png" alt="Foreach Loop 2 Variable Mappings"></p> <p>Script Task should be configured with following code that verifies the <code>SheetName</code> value against the pattern stored in <code>PatternToMatch</code> variable. If the SheetName matches the pattern, then the variable ProcessTheSheet is set to True or else it will be set to False.</p> <p><strong>C# code for SSIS 2008 and above</strong></p> <p>Include the statement <code>using System.Text.RegularExpressions;</code> for RegEx.</p> <pre><code>public void Main() { Variables varCollection = null; Dts.VariableDispenser.LockForRead("User::SheetName"); Dts.VariableDispenser.LockForRead("User::PatternToMatch"); Dts.VariableDispenser.LockForWrite("User::ProcessTheSheet"); Dts.VariableDispenser.GetVariables(ref varCollection); string sheetName = varCollection["User::SheetName"].Value.ToString(); string pattern = varCollection["User::PatternToMatch"].Value.ToString(); Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase); Match match = Regex.Match(sheetName, pattern); varCollection["User::ProcessTheSheet"].Value = match.Success; Dts.TaskResult = (int)ScriptResults.Success; } </code></pre> <p>Right the connector that joins the Script Task and the Data Flow Task and select Edit option. It should bring the Precedence Constraint Editor dialog. Set the Evaluation operation to Expression and set the Expression to <code>@[User::ProcessTheSheet]</code> This expression will allow the package to continue to Data Flow Task only if the Sheetname matches the pattern provided in the variable PatternToMatch. You can notice that the connector contains fx, which means there is an expression in place and also the color changes from Green to Blue.</p> <p><img src="https://i.stack.imgur.com/dqOuh.png" alt="Precedence Constraint"></p> <p>Inside the data flow task, place an Excel Source, Derived Column and OLE DB Destination as shown below:</p> <p><img src="https://i.stack.imgur.com/6FOcW.png" alt="Data Flow Task"></p> <p>Configure the Excel Source to read the appropriate Excel file and the sheet that is currently being looped through.</p> <p><img src="https://i.stack.imgur.com/Q1Ona.png" alt="Excel Source Connection Manager"></p> <p><img src="https://i.stack.imgur.com/3SEHc.png" alt="Excel Source Columns"></p> <p>Configure the derived column to create new columns for file name and sheet name. This is just to demonstrate this example but has no significance.</p> <p><img src="https://i.stack.imgur.com/eCKih.png" alt="Derived column"></p> <p>Configure the OLE DB destination to insert the data into the SQL table.</p> <p><img src="https://i.stack.imgur.com/ZUo6x.png" alt="OLE DB Destination Connection Manager"></p> <p><img src="https://i.stack.imgur.com/NprFQ.png" alt="OLE DB Destination Columns"></p> <p>Below screenshot shows successful execution of the package.</p> <p><img src="https://i.stack.imgur.com/K6PPc.png" alt="Execution successful"></p> <p>When the <code>PatternToMatch</code> variable is set to the value <strong>CA*</strong>, the table is populated only with the values from the Sheet Canada_1.</p> <p><img src="https://i.stack.imgur.com/SxqDJ.png" alt="CA_Pattern"></p> <p>Deleted all the rows from the table. Changed the <code>PatternToMatch</code> variable value to <strong>US*</strong>, the table is populated only with the values from the sheets US_1 and US_2.</p> <p><img src="https://i.stack.imgur.com/oIhQ8.png" alt="US_Pattern"></p> <p>Hope that helps.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload