This website requires JavaScript.

[译] SSIS: 排序文件,使Foreach Loop 组件按照顺序处理文件

需求

我需要文件按照时间顺序处理,但是foreach loop 没有排序的机制,有没有什么方法根据文件日期或者文件名排序?

解决方案

标准的foreach loop 容器是无法排序的, 微软的Feedback也有人反馈. 这玩意看上去默认是按照文件名排序的(不靠谱的样子). 你可以使用自定义的组件 Sorted File Enumerator  或者使用Script Task 实现这个功能. 这个脚本任务解决方案让你可以根据文件名,创年日期,最后修改日期或者其他文件属性来顺序或倒序排序.

0x01 Control Flow

托一个Script Task 和 Foreach Loop Container 到Control Flow,如下图

image

0x02 Variables

创建三个变量

  1. startlocatin(string) 用来指定排序文件所在的目录, 填入一个路径类似 “C:\BIHELL”
  2. dataset(object) 用来在Script Task 与 Foreach Loop之间’通讯’
  3. filepath(string) 用来映射 Foreach Loop 获得的的文件 image

0x03 Script Task

选择startlocation 作为只读变量(我们只用来读路径),然后选择dataset作为 readwritevariable(到时候会有一个ADO对象放在里面)

image

0x04 The Script

把下面的脚本拷贝到你的Script Task.  第52行用来设置过滤和排序,有点像SQL查询, 更多的样例代码可以参考msdn , 注意脚本里面记得添加 System.IO的引用 (注意,作者给的C#的代码有问题,如果报错,可以用VB.Net的)

// C# code using System; using System.Data; using System.IO;    // Added using Microsoft.SqlServer.Dts.Runtime; using System.Windows.Forms;

namespace ST_2e776e26793b45939128add3d850f70d.csproj {      [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]      public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase      {

         #region VSTA generated code          enum ScriptResults          {              Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,              Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure          };          #endregion

         public void Main()          {              // Create a dataset. I named it unsorted, because it's not yet sorted              DataSet dsUnsorted = new DataSet();

             // Create a new table in the dataset              DataTable filelistTable = dsUnsorted.Tables.Add();              filelistTable.Columns.Add("FilePath", typeof(string));  // Filepath needed for connectionstring.              filelistTable.Columns.Add("FileName", typeof(string));  // Filename used for sorting [optional].              filelistTable.Columns.Add("FileDate", typeof(DateTime));// Filedate used for sorting [optional].

             // Get all files within the folder                    string[] allFiles = Directory.GetFiles(Dts.Variables["User::startlocation"].Value.ToString());

             // Variable for storing file properties              FileInfo fileInfo;

             // Loop through the files in the folder                    foreach (string currentFile in allFiles)              {                  // Fill fileInfo variable with file information                          fileInfo = new FileInfo(currentFile);

                 // Choose which the file properties you will use                  // Columns:            FilePath           FileName       FileDate                  filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime);              }

             // Filtering on .txt extension. Note: like uses * instead of %              // Sorting the files on filename (or filedate: FileName DESC)              DataRow[] rows = dsUnsorted.Tables[0].Select("FileName like '.txt'", "FileDate ASC");

             // Create a new sorted dataset that the SSIS foreach loop uses.              DataSet dsSorted = new DataSet();              DataTable filelistTableSorted = dsSorted.Tables.Add();

             // Only interested in the filepath which is needed for the connectionstring              filelistTableSorted.Columns.Add("FilePath", typeof(string));

             // Fill the new dataset with the sorted rows.              foreach (DataRow row in rows)              {                  filelistTableSorted.Rows.Add(row["FilePath"].ToString());              }

             // Store the dataset in the SSIS variable                    Dts.Variables["dataset"].Value = dsSorted;

             Dts.TaskResult = (int)ScriptResults.Success;          }      } }

或VB.net
' VB.Net Code Imports System Imports System.Data Imports System.IO       ' Added Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _ <System.CLSCompliantAttribute(False)> _ Partial Public Class ScriptMain   Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

  Enum ScriptResults    Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success    Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure   End Enum

     Public Sub Main()          ' Create a dataset. I named it unsorted, because it's not yet sorted          Dim dsUnsorted As New DataSet()

         ' Create a new table in the dataset          Dim filelistTable As DataTable = dsUnsorted.Tables.Add()          filelistTable.Columns.Add("FilePath", GetType(String))   ' Filepath needed for connectionstring.          filelistTable.Columns.Add("FileName", GetType(String))   ' Filename used for sorting [optional].          filelistTable.Columns.Add("FileDate", GetType(DateTime)) ' Filedate used for sorting [optional].

         ' Get all files within the folder                Dim allFiles As String() = Directory.GetFiles(Dts.Variables("User::startlocation").Value.ToString())

         ' Variable for storing file properties          Dim fileInfo As FileInfo

         ' Loop through the files in the folder                For Each currentFile As String In allFiles              ' Fill fileInfo variable with file information                      fileInfo = New FileInfo(currentFile)

             ' Choose which the file properties you will use              ' Columns:            FilePath           FileName       FileDate              filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime)          Next

         ' Filtering on .txt extension. Note: like uses * instead of %          ' Sorting the files on filename (or filedate: FileName DESC)          Dim rows As DataRow() = dsUnsorted.Tables(0).[Select]("FileName like '.txt'", "FileDate ASC")

         ' Create a new sorted dataset that the SSIS foreach loop uses.          Dim dsSorted As New DataSet()          Dim filelistTableSorted As DataTable = dsSorted.Tables.Add()

         ' Only interested in the filepath which is needed for the connectionstring          filelistTableSorted.Columns.Add("FilePath", GetType(String))

         ' Fill the new dataset with the sorted rows.          For Each row As DataRow In rows              filelistTableSorted.Rows.Add(row("FilePath").ToString())          Next

         ' Store the dataset in the SSIS variable                Dts.Variables("dataset").Value = dsSorted          Dts.TaskResult = ScriptResults.Success      End Sub

End Class

 

0x05 Foreach Loop Container

编辑Foreach Loop  Container 组件,然后把Collection 标签的enumerator 修改为 Foreach ADO Enumerator. 也就是让loop 使用 ADO 对象. 选择dataset作为ADO object的输入源. 这里面包含了Script Task排序好的文件.

image

0x06 Variable Mappings

把变量filepath 映射到Index 0, 这样就存储了文件的路径.

image

0x07 Test the result

现在可以验证一下结果,拖个Script Task 然后用Message Box 显示 文件路径吧

MessageBox.Show(Dts.Variables["User::filepath"].Value.ToString());
[![image](http://www.bihell.com/wp-content/uploads/2015/10/image_thumb52.png "image")](http://www.bihell.com/wp-content/uploads/2015/10/image52.png)

注意: 当然还有其他方式实现这个,比如把所有文件名放到数据库中,然后在加个Foreach Loop 读取排序后的表格.

样例包 2008 样例包 2012

原文连接:How to configure a Foreach Loop Container: Sorted File Enumerator

 

0条评论
avatar