What is data profiling?

1 comment:

  1. Data profiling is an interesting task; if you love data, you will enjoy doing it. As the name suggests, data profiling means profiling, or analysing, the data: its organization, type, volume and any other aspect the data can have. There can be various reasons for profiling. The following are just a few:

    1) To check whether the data is suitable for business requirements other than the one it originated from.
    2) During a migration from one system to another, to analyse whether the data is suitable for the target.
    3) To learn about the data's container: whether the database or file system is adequate to hold the information or needs to change.
    4) To categorise the data so that it can be used more effectively in business decisions.
    5) To check the quality of the data. A quality check can benefit decision support systems (DSS) and other interfaces that consume the data.
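
    The suitability and quality checks in points 2 and 5 can be sketched roughly as follows. This is only an illustration, not how any particular tool works: the TargetColumn class, the find_violations function, the column names and the sample rows are all hypothetical, and a real migration would check many more rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetColumn:
    """Hypothetical description of one column in the target system."""
    max_length: int
    nullable: bool
    valid_values: Optional[frozenset] = None  # None means "any value allowed"


def find_violations(rows, target):
    """Return (row index, column, problem) for every value the target would reject."""
    violations = []
    for index, row in enumerate(rows):
        for name, spec in target.items():
            value = row.get(name)
            if value in (None, ""):
                # Treat empty strings as missing values.
                if not spec.nullable:
                    violations.append((index, name, "missing value"))
                continue
            if len(value) > spec.max_length:
                violations.append((index, name, "too long"))
            if spec.valid_values is not None and value not in spec.valid_values:
                violations.append((index, name, "not in valid values"))
    return violations


# Assumed target definition and source data, invented for this example.
TARGET = {
    "country": TargetColumn(max_length=2, nullable=False,
                            valid_values=frozenset({"DE", "FR"})),
    "name": TargetColumn(max_length=10, nullable=True),
}

SOURCE_ROWS = [
    {"country": "DE", "name": "Anna"},
    {"country": "", "name": "Bartholomew Smith"},
    {"country": "USA", "name": ""},
]

for violation in find_violations(SOURCE_ROWS, TARGET):
    print(violation)
```

    Rows that produce no violations can be loaded into the target as they are; the others show where the data or the target definition needs to change.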

    During the analysis, the following aspects of the data can be considered:

    Database/Business File
    Business Table/File Name
    Business Table/Element Name
    Element Definition
    Table Name/File Name
    Column Name
    Data Type
    Data Length
    Decimal Places
    Valid Values
    PK
    FK
    NULL Y/N
    Element Required
    Element Mandatory
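
    Several of the physical aspects in this list (data type, data length, NULL Y/N, PK) can be derived from the data itself. The sketch below is a minimal example under stated assumptions: it uses only the Python standard library, the profile_columns and infer_type functions are hypothetical names, the sample CSV is invented, and the type inference is deliberately coarse.

```python
import csv
import io
from collections import Counter


def infer_type(values):
    """Guess a coarse data type from a list of non-empty strings."""
    def all_match(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if not values:
        return "unknown"
    if all_match(int):
        return "integer"
    if all_match(float):
        return "decimal"
    return "string"


def profile_columns(rows):
    """Return one summary dict per column for a list of dict rows."""
    if not rows:
        return {}
    report = {}
    for column in rows[0].keys():
        raw = [row[column] for row in rows]
        present = [v for v in raw if v not in ("", None)]
        counts = Counter(present)
        report[column] = {
            "data_type": infer_type(present),
            "max_length": max((len(v) for v in present), default=0),
            "nullable": len(present) < len(raw),
            "distinct_values": len(counts),
            # No missing values and no repeats: the column could serve as a key.
            "pk_candidate": len(present) == len(raw) and len(counts) == len(raw),
        }
    return report


# Invented sample data; the second row has an empty balance.
SAMPLE = """customer_id,country,balance
1,DE,10.50
2,FR,
3,DE,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for name, summary in profile_columns(rows).items():
    print(name, summary)
```

    Logical aspects such as the business name or the element definition cannot be inferred this way and have to come from documentation or from the people who own the data.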

    The aspects above can be subdivided into the logical and physical arrangement of the data. The following are some of the tools used in the industry to profile data:

    Informatica Data Explorer
    Talend Open Profiler
    Oracle Data Integrator
    Datamartist
    Datiris
    Pervasive Data Profiler
