[译] SSIS: 排序文件,使Foreach Loop 组件按照顺序处理文件
默认分类
2020-01-15
498
0
需求
我需要文件按照时间顺序处理,但是foreach loop 没有排序的机制,有没有什么方法根据文件日期或者文件名排序?
解决方案
标准的foreach loop 容器是无法排序的, 微软的Feedback也有人反馈. 这玩意看上去默认是按照文件名排序的(不靠谱的样子). 你可以使用自定义的组件 Sorted File Enumerator 或者使用Script Task 实现这个功能. 这个脚本任务解决方案让你可以根据文件名,创年日期,最后修改日期或者其他文件属性来顺序或倒序排序.
0x01 Control Flow
托一个Script Task 和 Foreach Loop Container 到Control Flow,如下图
0x02 Variables
创建三个变量
- startlocatin(string) 用来指定排序文件所在的目录, 填入一个路径类似 “C:\BIHELL”
- dataset(object) 用来在Script Task 与 Foreach Loop之间’通讯’
- filepath(string) 用来映射 Foreach Loop 获得的的文件
0x03 Script Task
选择startlocation 作为只读变量(我们只用来读路径),然后选择dataset作为 readwritevariable(到时候会有一个ADO对象放在里面)
0x04 The Script
把下面的脚本拷贝到你的Script Task. 第52行用来设置过滤和排序,有点像SQL查询, 更多的样例代码可以参考msdn , 注意脚本里面记得添加 System.IO的引用 (注意,作者给的C#的代码有问题,如果报错,可以用VB.Net的)
// C# code using System; using System.Data; using System.IO; // Added using Microsoft.SqlServer.Dts.Runtime; using System.Windows.Forms;或VB.netnamespace ST_2e776e26793b45939128add3d850f70d.csproj { [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")] public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase {
#region VSTA generated code enum ScriptResults { Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success, Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure }; #endregion
public void Main() { // Create a dataset. I named it unsorted, because it's not yet sorted DataSet dsUnsorted = new DataSet();
// Create a new table in the dataset DataTable filelistTable = dsUnsorted.Tables.Add(); filelistTable.Columns.Add("FilePath", typeof(string)); // Filepath needed for connectionstring. filelistTable.Columns.Add("FileName", typeof(string)); // Filename used for sorting [optional]. filelistTable.Columns.Add("FileDate", typeof(DateTime));// Filedate used for sorting [optional].
// Get all files within the folder string[] allFiles = Directory.GetFiles(Dts.Variables["User::startlocation"].Value.ToString());
// Variable for storing file properties FileInfo fileInfo;
// Loop through the files in the folder foreach (string currentFile in allFiles) { // Fill fileInfo variable with file information fileInfo = new FileInfo(currentFile);
// Choose which the file properties you will use // Columns: FilePath FileName FileDate filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime); }
// Filtering on .txt extension. Note: like uses * instead of % // Sorting the files on filename (or filedate: FileName DESC) DataRow[] rows = dsUnsorted.Tables[0].Select("FileName like '.txt'", "FileDate ASC");
// Create a new sorted dataset that the SSIS foreach loop uses. DataSet dsSorted = new DataSet(); DataTable filelistTableSorted = dsSorted.Tables.Add();
// Only interested in the filepath which is needed for the connectionstring filelistTableSorted.Columns.Add("FilePath", typeof(string));
// Fill the new dataset with the sorted rows. foreach (DataRow row in rows) { filelistTableSorted.Rows.Add(row["FilePath"].ToString()); }
// Store the dataset in the SSIS variable Dts.Variables["dataset"].Value = dsSorted;
Dts.TaskResult = (int)ScriptResults.Success; } } }
' VB.Net Code Imports System Imports System.Data Imports System.IO ' Added Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _ <System.CLSCompliantAttribute(False)> _ Partial Public Class ScriptMain Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
Enum ScriptResults Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure End Enum
Public Sub Main() ' Create a dataset. I named it unsorted, because it's not yet sorted Dim dsUnsorted As New DataSet()
' Create a new table in the dataset Dim filelistTable As DataTable = dsUnsorted.Tables.Add() filelistTable.Columns.Add("FilePath", GetType(String)) ' Filepath needed for connectionstring. filelistTable.Columns.Add("FileName", GetType(String)) ' Filename used for sorting [optional]. filelistTable.Columns.Add("FileDate", GetType(DateTime)) ' Filedate used for sorting [optional].
' Get all files within the folder Dim allFiles As String() = Directory.GetFiles(Dts.Variables("User::startlocation").Value.ToString())
' Variable for storing file properties Dim fileInfo As FileInfo
' Loop through the files in the folder For Each currentFile As String In allFiles ' Fill fileInfo variable with file information fileInfo = New FileInfo(currentFile)
' Choose which the file properties you will use ' Columns: FilePath FileName FileDate filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime) Next
' Filtering on .txt extension. Note: like uses * instead of % ' Sorting the files on filename (or filedate: FileName DESC) Dim rows As DataRow() = dsUnsorted.Tables(0).[Select]("FileName like '.txt'", "FileDate ASC")
' Create a new sorted dataset that the SSIS foreach loop uses. Dim dsSorted As New DataSet() Dim filelistTableSorted As DataTable = dsSorted.Tables.Add()
' Only interested in the filepath which is needed for the connectionstring filelistTableSorted.Columns.Add("FilePath", GetType(String))
' Fill the new dataset with the sorted rows. For Each row As DataRow In rows filelistTableSorted.Rows.Add(row("FilePath").ToString()) Next
' Store the dataset in the SSIS variable Dts.Variables("dataset").Value = dsSorted Dts.TaskResult = ScriptResults.Success End Sub
End Class
0x05 Foreach Loop Container
编辑Foreach Loop Container 组件,然后把Collection 标签的enumerator 修改为 Foreach ADO Enumerator. 也就是让loop 使用 ADO 对象. 选择dataset作为ADO object的输入源. 这里面包含了Script Task排序好的文件.
0x06 Variable Mappings
把变量filepath 映射到Index 0, 这样就存储了文件的路径.
0x07 Test the result
现在可以验证一下结果,拖个Script Task 然后用Message Box 显示 文件路径吧
MessageBox.Show(Dts.Variables["User::filepath"].Value.ToString());[](http://www.bihell.com/wp-content/uploads/2015/10/image52.png)
注意: 当然还有其他方式实现这个,比如把所有文件名放到数据库中,然后在加个Foreach Loop 读取排序后的表格.
原文连接:How to configure a Foreach Loop Container: Sorted File Enumerator