Configuring Wireless

Tuesday, August 13, 2013

PL/SQL: Dynamically Building Your Data Archive

Posted on 7:53 AM by Unknown
This post outlines a design I put together, as part of an internal project, for dynamically building a data archive using rules based on the source data being fed into the system. It's far from complete, but I think it highlights an interesting way of building an archive when, at design time, you don't know exactly what data you will be putting into it.

THE PROBLEM
At the moment, a multitude of different loading programs (SSIS, command-line applications, scripts, etc.) are used to put data from various sources into the data archive. Each uses its own rules to determine where the source data ends up (largely dependent on what the developer decided when putting it together), and inter-dependencies are largely invisible.

New feeds are added at a rate of one every other month, and the system should cope with this while keeping track of the dependencies in the database.

DESIGNING THE SOLUTION
In essence, this solution aims to provide a single point of entry into the data archive: you put your source data there, and a pre-specified set of rules determines where in the archive the data ends up and what format it's in.

A simple diagram for the system is;
[Figure: System Diagram]
The specific bit that is "in scope" for this work is the "LOAD Process". How data gets into the DATASOURCE tables really depends on where the data is coming from, what format it's in, etc., and it's practically impossible to write something generic enough to cover every possible option from a CSV text file to a database link.

The aim of the solution will be to process the data as it arrives but it's possible that it could be adapted to work with data in batches.

THE PROPOSAL
I've created a fairly simple structure using the tables;
  • SOURCEDATATYPE - This holds a unique reference and description for each different data source
  • STAGINGOUTPUT - This table holds the raw data as loaded into the database from the external feed (I went with this name in case it's necessary to use staging tables for the IMPORT process to manipulate the data prior to it being loaded via the LOAD process)
  • ENTITY - This is the name for a table that is being created as part of the LOAD process in the Data Archive.
  • ENTITYDETAIL - This table contains information on how the data from the source table should be manipulated before being moved into the ENTITY table.
Here's a simple data structure;
[Figure: Database Structure]
As you can see, it's pretty self-explanatory.
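
For readers without the diagram to hand, here is a rough sketch of DDL that would fit the description above. Only SHORT_CODE, the STAGINGOUTPUT columns DATE01, NUMBER01, NUMBER04 and TEXT01, and the five ENTITYDETAIL expression columns are named in this post. Every other column name, all data types and sizes, the keys, and the PROCESSED_FLAG marker are assumptions for illustration, not the actual schema.

```sql
-- Hypothetical DDL: a sketch that fits the description, not the real schema.
CREATE TABLE sourcedatatype (
  sourcedatatype_id NUMBER        PRIMARY KEY,     -- assumed key
  short_code        VARCHAR2(10)  NOT NULL,        -- prefix for generated table names
  description       VARCHAR2(200)
);

CREATE TABLE stagingoutput (
  sourcedatatype_id NUMBER NOT NULL REFERENCES sourcedatatype,  -- which feed the row came from
  date01            DATE,
  number01          NUMBER,
  number04          NUMBER,
  text01            VARCHAR2(4000),
  processed_flag    CHAR(1) DEFAULT 'N' NOT NULL   -- assumed "processed" marker
);

CREATE TABLE entity (
  entity_id         NUMBER PRIMARY KEY,            -- assumed key
  sourcedatatype_id NUMBER NOT NULL REFERENCES sourcedatatype,
  entity_name       VARCHAR2(30) NOT NULL          -- table name before the prefix is applied
);

CREATE TABLE entitydetail (
  sourcedatatype_id      NUMBER NOT NULL REFERENCES sourcedatatype,  -- assumed
  entity_id              NUMBER NOT NULL REFERENCES entity,          -- assumed
  detail_seq             NUMBER NOT NULL,                            -- assumed ordering column
  column_name_expression VARCHAR2(4000),
  row_unique_expression  VARCHAR2(4000),
  value_expression       VARCHAR2(4000),
  on_create_type         VARCHAR2(100),
  on_create_default      VARCHAR2(100)
);
```

The three leading numeric columns on ENTITYDETAIL are a guess at what the three leading values in the example INSERT later in this post represent.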

Once you've configured the data source type and the entity details, you're ready to start loading data.

In order to load the database I've created a package called DW_LOADDATA. This has two routines;
  • ProcessAll, and
  • ProcessRow (p_rowID ROWID)
Basically "ProcessAll" loops through the unprocessed rows and passes them one at a time to the "ProcessRow" routine.
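
As a minimal sketch of that shape (not the actual package source), the spec and the driving loop might look something like the following. The PROCESSED_FLAG column on STAGINGOUTPUT is an assumed way of marking rows as processed, and ProcessRow is reduced to a stub here.

```sql
CREATE OR REPLACE PACKAGE dw_loaddata AS
  PROCEDURE ProcessAll;
  PROCEDURE ProcessRow (p_rowID ROWID);
END dw_loaddata;
/

CREATE OR REPLACE PACKAGE BODY dw_loaddata AS

  PROCEDURE ProcessRow (p_rowID ROWID) IS
  BEGIN
    -- Stub: the real routine would do the lookup/create/merge work first.
    -- PROCESSED_FLAG is an assumed column, not taken from the real schema.
    UPDATE stagingoutput
       SET processed_flag = 'Y'
     WHERE rowid = p_rowID;
  END ProcessRow;

  PROCEDURE ProcessAll IS
  BEGIN
    -- Hand each unprocessed staging row to ProcessRow, one at a time.
    FOR r IN (SELECT rowid AS rid
                FROM stagingoutput
               WHERE processed_flag = 'N')
    LOOP
      ProcessRow(r.rid);
    END LOOP;
  END ProcessAll;

END dw_loaddata;
/
```

Passing a ROWID keeps ProcessRow independent of whatever key the staging table has, which fits the idea of calling it directly for a single row as data arrives.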

The ProcessRow routine performs the following steps;
  • Get the new record from STAGINGOUTPUT
  • Identify the ENTITY/ENTITYDETAIL for the feed specified in the STAGINGOUTPUT record
  • Check to see if the ENTITY exists - if not create it.
  • Work out the column name, and if that doesn't exist as part of the ENTITY create it
  • Does a value already exist? If so update it (using MERGE), otherwise INSERT the new value
  • Mark the STAGINGOUTPUT record as processed
Sounds simple? Well, it's less than 150 lines of code including comments and formatting ...
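
To give a feel for the create-if-missing and MERGE steps, here is a rough sketch using dynamic SQL with hard-coded values. It is not the real ProcessRow: the table name (including the SAL_ prefix), the ROW_KEY column, and the sample key and amount are all assumptions for illustration. In the real routine these values would come from evaluating the ENTITYDETAIL expressions against the STAGINGOUTPUT row.

```sql
DECLARE
  -- Illustrative values only (all assumed).
  l_table   VARCHAR2(30)  := 'SAL_SUPPLIER_SALES_BY_MONTH';
  l_column  VARCHAR2(30)  := 'PERIOD_201301';
  l_type    VARCHAR2(100) := 'NUMBER';   -- on_create_type
  l_default VARCHAR2(100) := '0';        -- on_create_default
  l_key     NUMBER        := 1001;       -- result of row_unique_expression
  l_amount  NUMBER        := 250;        -- SO.NUMBER04 for this row
  l_count   PLS_INTEGER;
BEGIN
  -- Create the ENTITY table if it doesn't exist yet.
  SELECT COUNT(*) INTO l_count
    FROM user_tables
   WHERE table_name = l_table;
  IF l_count = 0 THEN
    EXECUTE IMMEDIATE
      'CREATE TABLE ' || l_table || ' (row_key NUMBER PRIMARY KEY)';
  END IF;

  -- Add the column if the table doesn't have it yet.
  SELECT COUNT(*) INTO l_count
    FROM user_tab_columns
   WHERE table_name = l_table
     AND column_name = l_column;
  IF l_count = 0 THEN
    EXECUTE IMMEDIATE
      'ALTER TABLE ' || l_table || ' ADD (' || l_column || ' ' ||
      l_type || ' DEFAULT ' || l_default || ')';
  END IF;

  -- Update the value if the row exists, otherwise insert it.
  EXECUTE IMMEDIATE
    'MERGE INTO ' || l_table || ' old
      USING (SELECT :k AS row_key, :v AS amount FROM dual) so
         ON (old.row_key = so.row_key)
       WHEN MATCHED THEN UPDATE
            SET old.' || l_column || ' = NVL(old.' || l_column || ', 0) + so.amount
       WHEN NOT MATCHED THEN INSERT (row_key, ' || l_column || ')
            VALUES (so.row_key, so.amount)'
    USING l_key, l_amount;
END;
/
```

Run in a scratch schema, the first execution creates the table and column and inserts the row; a second execution in the same session takes the MATCHED branch and adds the amount to the existing value. Because object names are concatenated into the statements, a real implementation would want to validate them first (for example with DBMS_ASSERT.SIMPLE_SQL_NAME).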

The key is the information in the ENTITY/ENTITYDETAIL tables. For example let's suppose I'm loading sales data and I want to create an ENTITY called SUPPLIER_SALES_BY_MONTH with separate columns for each month of data.

In the ENTITY table I'd create a simple record with the name of the new ENTITY (bearing in mind the actual name of the table will be prefixed with the Short_Code from the SOURCEDATATYPE table) and then in the ENTITYDETAIL table create the following rows;

INSERT INTO ENTITYDETAIL
SELECT 1, 1, 2,
  '''PERIOD_'' || TO_CHAR(SO.DATE01, ''YYYYMM'')', -- column_name_expression
  'SO.NUMBER01', -- row_unique_expression
  'OLD.VALUE = NVL(OLD.VALUE, 0) + SO.NUMBER04', -- value_expression
  'NUMBER', -- on_create_type
  '0' -- on_create_default
FROM DUAL
UNION ALL SELECT 1, 1, 1,
  '''SUPPLIER_NAME''', -- column_name_expression
  'SO.NUMBER01', -- row_unique_expression
  'OLD.VALUE = SO.TEXT01', -- value_expression
  'VARCHAR2(80)', -- on_create_type
  '0' -- on_create_default
FROM DUAL;


I know "INSERT INTO ..." probably isn't the best way to do this but this is only an example!

As you can see, the column_name_expression looks at the SO (STAGINGOUTPUT) table and formats the first date as YYYYMM, so a value of 13-JAN-2013 will create/update the column PERIOD_201301.

The value_expression then adds that row's sales (SO.NUMBER04) to the supplier's running total for that month.
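
To see how a stored expression like this turns into a column name, here is a small standalone sketch. It evaluates the expression text with dynamic SQL, using an inline view aliased SO as a stand-in for a STAGINGOUTPUT row. How the real package evaluates its expressions isn't shown in this post, so treat the mechanism as an assumption.

```sql
-- SET SERVEROUTPUT ON first if you want to see the output in SQL*Plus.
DECLARE
  -- The same text that is stored in column_name_expression.
  l_expr   VARCHAR2(4000) := '''PERIOD_'' || TO_CHAR(SO.DATE01, ''YYYYMM'')';
  l_column VARCHAR2(30);
BEGIN
  -- The inline view stands in for one STAGINGOUTPUT row (assumed sample date).
  EXECUTE IMMEDIATE
    'SELECT ' || l_expr ||
    ' FROM (SELECT DATE ''2013-01-13'' AS date01 FROM dual) so'
    INTO l_column;

  DBMS_OUTPUT.PUT_LINE(l_column);   -- prints PERIOD_201301
END;
/
```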

The second column that's created is the SUPPLIER_NAME - this is simply the name of the supplier. If I run this using some random test data I end up with a table that looks like;
[Figure: Generated Table]
I've created a script which creates the objects and loads some simple test data. It's available here (via Google Drive - DO NOT RUN IT IN AN EXISTING DATABASE SCHEMA UNLESS YOU WANT OBJECTS STARTING WITH SAL_ TO BE DROPPED!). You'll need to have set up a user with default tablespace permissions in order to get the script to work.

Let me know in the comments if you find this useful.