Ir al contenido principal

BI - SSIS ( Basics I )

SSIS - Basic Concepts

Tasks

  • Bulk Insert Task—Loads data into a table by using the BULK INSERT SQL command.
  • Data Flow Task—This is the most important task that loads and transforms data into an OLE DB Destination.
  • Execute Package Task—Enables you to execute a package from within a package, making your SSIS packages modular.
  • Execute Process Task—Executes a program external to your package, like one to split your extract file into many files before processing the individual files.
  • Execute SQL Task—Executes a SQL statement or stored procedure.
  • File System Task—This task can handle directory operations like creating, renaming, or deleting a directory. It can also manage file operations like moving, copying, or deleting files.
  • FTP Task—Sends or receives files from an FTP site.
  • Script Task—Runs a set of VB.NET or C# coding inside a Visual Studio environment.
  • Send Mail Task—Sends a mail message through SMTP.
  • Analysis Services Processing Task—This task processes a SQL Server Analysis Services cube, dimension, or mining model.
  • Web Service Task—Executes a method on a web service.
  • WMI Data Reader Task—This task can run WQL queries against the Windows Management Instrumentation (WMI). This enables you to read the event log, get a list of applications that are installed, or determine hardware that is installed, to name a few examples.
  • WMI Event Watcher Task—This task empowers SSIS to wait for and respond to certain WMI events that occur in the operating system.
  • XML Task—Parses or processes an XML file. It can merge, split, or reformat an XML file.

Transformations

  • Aggregate—Aggregates data from a transform or source similar to a GROUP BY statement in T-SQL.
  • Conditional Split—Splits the data based on certain conditions being met. For example, if the State column is equal to Florida, send the data down a different path. This transform is similar to a CASE statement in T-SQL.
  • Data Conversion—Converts a column’s data type to another data type. This transform is similar to a CAST statement in T-SQL.
  • Derived Column—Performs an in-line update to the data or creates a new column from a formula. For example, you can use this to calculate a Profit column based on a Cost and SellPrice set of columns.
  • Fuzzy Grouping—Performs data cleansing by finding rows that are likely duplicates.
  • Fuzzy Lookup—Matches and standardizes data based on fuzzy logic. For example, this can transform the name Jon to John.
  • Lookup—Performs a lookup on data to be used later in a transformation. For example, you
  • can use this transformation to look up a city based on the ZIP code.
  • Multicast—Sends a copy of the data to an additional path in the workflow and can be used to parallelize data. For example, you may want to send the same set of records to two tables.
  • OLE DB Command—Executes an OLE DB command for each row in the Data Flow. Can be used to run an UPDATE or DELETE statement inside the Data Flow.
  • Row Count—Stores the row count from the Data Flow into a variable for later use by, perhaps, an auditing solution.
  • Script Component—Uses a script to transform the data. For example, you can use this to apply specialized business logic to your Data Flow.
  • Slowly Changing Dimension—Coordinates the conditional insert or update of data in a slowly changing dimension during a data warehouse load.
  • Sort—Sorts the data in the Data Flow by a given column and removes exact duplicates.
  • Union All—Merges multiple data sets into a single data set.
  • Unpivot—Unpivots the data from a non-normalized format to a relational format.

Project Structure

  • .dtsx—An SSIS package
  • .ds—A shared data source file
  • .sln—A solution file that contains one or more projects
  • .dtproj—An SSIS project file
  • .params—A shared parameter file
  • .conmgr—A shared connection manager

Control Flow / Data Flow

Control Flow tab controls the execution of the package 
Data Flow tab handles the movement of data

Precedence Constrains 

Are the lines between tasks in the Data Flow
  • Green = On Success
  • Red = On Failure
  • Blue = On Completion
  • Any color with FX Logo = Expression, or Expression with a Constraint
In the Control Flow
  • Create a precedence constraints 
  • You can also specify a condition of type: AND / OR

Comentarios

Entradas populares de este blog

C# Using tabs

To use tabs in C# use the TabContainer element from AjaxControlToolkit Include AjaxControlToolkit  Include in the Web.config file, inside the tag <system.web> the following code  <pages>       <controls>         <add tagPrefix="ajaxCTK" namespace="AjaxControlToolkit" assembly="AjaxControlToolkit"/>       </controls>     </pages>   Include TabContainer element First  include TabContainer element that is the section where all the tabs will be displayed. <ajaxCTK:TabContainer ID="TabContainerUpdate" runat="server"                 Height="800"                 CssClass="ajax__tab_style"> </ajaxCTK:TabContainer> Second per each tab include the following code corresponding to each ...

Rails - Basic Steps III

pValidations Validations are a type of ActiveRecord Validations are defined in our models Implement Validations Go to   root_app/app/models Open files  *.rb for each model Mandatory field validates_presence_of   :field Ex:   validates_presence_of    :title Classes The basic syntax is class MyClass        @global_variable                def my_method              @method_variable        end end Create an instance myInstance = MyClass.new Invoke a mehod mc.my_method class() method returns the type of the object In Ruby, last character of method define the behavior If ends with a question -> return a boolean value If ends with an exclamation -> change the state of the object Getter / Setter method def global_variable       return @global_variable end ...

Python create package

Create a root folder Create a sub-folder "example_pkg" that contains the funtionallity packaging_tutorial/ example_pkg/ __init__.py In the root folder create the following structure  packaging_tutorial/ example_pkg/ __init__.py tests/ setup.py LICENSE README.md in the setup.py contains the configuration of the packages your package is found by find_packages() import setuptools with open ( "README.md" , "r" ) as fh : long_description = fh . read () setuptools . setup ( name = "example-pkg-YOUR-USERNAME-HERE" , # Replace with your own username version = "0.0.1" , author = "Example Author" , author_email = "author@example.com" , description = "A small example package" , long_description = long_description , long_description_content_type = "text/markdown" , url = "https://github.com/pypa/sam...