Saturday, August 11, 2012

Email task, Session and Workflow notification : Informatica

One of the advantages of using ETL tools is that functionality such as monitoring, logging and notification is either built-in or very easy to incorporate into your ETL with minimal coding. This post explains the Email task, which is part of the notification framework in Informatica. I have added some guidelines at the end on a few standard practices when using email tasks and the reasons behind them.
1. Workflow and session details.
2. Creating the Email Task (Re-usable)
3. Adding Email task to sessions
4. Adding Email Task at the Workflow Level
5. Emails in the Parameter file (Better maintenance, Good design).
6. Standard (Good) Practices
7. Common issues/Questions
1. Workflow and session details.
Here is the sample workflow that I am using. The workflow (wkf_Test) has 2 sessions.
s_m_T1 : Loads data from Source to Staging table (T1).
s_m_T2 : Loads data from Staging (T1) to Target (T2).
The actual mappings are almost irrelevant for this example, but we need at least two sessions to illustrate the different scenarios possible.
Test workflow (wkf_Test) with the two sessions
2. Creating the Email Task (Re-usable)
Why re-usable? Because we'll be using the same email task for all the sessions in this workflow.
1. Go to Workflow Manager and connect to the repository and the folder in which your workflow is present.
2. Go to the Workflow Designer Tab.
3. Click Workflows > Edit (from the menu) and create a workflow variable as below (to hold the failure email address).
Failure Email workflow variable
4. Go to the "Task Developer" tab and click Tasks > Create from the menu.
5. Select “Email Task”, enter “Email_Wkf_Test_Failure” for the name (since this email task is for different sessions in wkf_test).
Click “Create” and then “Done”. Save changes (Repository -> Save or the good old ctrl+S).
6. Double click on the Email Task and enter the following details in the properties tab.
Email User Name : $$FailureEmail   (Replace the pre-populated service variable $PMFailureEmailUser,
                                    since we will be setting this for each workflow as needed.)
Email subject   : Informatica workflow ** WKF_TEST ** failure notification.
Email text      : (see below. Note that the server variables might be disabled, but they will be available at run time.)
Please see the attached log for details. Contact ETL_RUN_AND_SUPPORT@XYZ.COM for further information.
 
%g
Folder : %n
Workflow : wkf_test
Session : %s
Create Email Task
3. Adding Email task to sessions
7. Go to the Workflow Designer tab and double-click on session s_m_T1. You should see the "Edit Task" window.
8. Make sure "Fail parent if this task fails" is checked in the General tab and "Stop on errors" is set to 1 in the Config Object tab.
Go to the "Components" tab.
9. In the "On Failure E-Mail" row, select "Reusable" for Type and click the LOV in the Value column.
10. Select the email task that we just created (Email_Wkf_Test_Failure), and click OK.
Adding Email Task to a session
4. Adding Email Task at the Workflow Level
Workflow-level failure/suspension email.
If you are already implementing a failure email for each session (and getting the session log for the failed session), you should consider just suspending the workflow. If you don't need session-level details, the workflow suspension email alone makes sense.
There are two settings you need to set for Failure notification emails at workflow level.
a) Suspend on error (Check)
b) Suspension email (select the email task as before). Remember that if you have both session-level and workflow-level emails, you'll get two emails when a session fails and causes the parent to fail.
Informatica workflow suspension email
Workflow success email
In some cases, you might have a requirement to add a success email once the entire workflow is complete.
This helps people know the workflow status for the day without having to access the Workflow Monitor or ask the run teams for the status each day. This is particularly helpful for business teams, who are mainly concerned with whether the process completed for the day.
1) Go to the Workflow Designer tab in Workflow Manager and click Tasks > Create > Email Task.
2) Enter the name of the email task and click OK.
3) In the General tab, select "Fail parent if this task fails". In the Properties tab, add the necessary details.
Note that the variables are not available anymore, since they are only applicable at the session level.
4) Add the necessary link conditions (Session.Status = SUCCEEDED) on the links from all the preceding tasks, as shown in the sketch below.
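For the workflow in this example, the link conditions would look something like this (a sketch; "Email_Success" stands for whatever you named the success email task in step 2, and the predefined Status variable is used on each link):
s_m_T1 --> s_m_T2        : $s_m_T1.Status = SUCCEEDED
s_m_T2 --> Email_Success : $s_m_T2.Status = SUCCEEDED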
Here’s how your final workflow will look.
Informatica success emails
5. Emails in the Parameter file (Better maintenance, Good design).
We’ve created the workflow variable $$FailureEmail and used it in the email task. But how and when is the value assigned?
You can manage the failure emails by assigning the value in the parameter file.
Here is my parameter file for this example. You can separate multiple emails using commas.
infa@ DEV /> cat wkf_test.param
[rchamarthi.WF:wkf_Test]
$$FailureEmail=rajesh@etl-developer.com
 
[rchamarthi.WF:wkf_Test.ST:s_m_T1]
$DBConnection_Target=RC_ORCL102
 
[rchamarthi.WF:wkf_Test.ST:s_m_T2]
$DBConnection_Target=RC_ORCL102
While it might look like a simpler approach initially, hard-coding email IDs in the email task is a bad idea. Here's why.
Like every other development cycle, Informatica ETLs go through Dev, QA and Prod, and the failure email for each environment will be different. When you promote components from Dev to QA and then to Prod, everything from mapping to session to workflow should be identical in all environments. Anything that changes or might change should be handled using parameter files (similar to env files in Unix). This also works the other way around: when you copy a workflow from Production to Development and make changes, the failure emails will not go to business users or QA teams, since the development parameter file only has the developer email IDs.
If you use parameter files, here is how it would be set up once in each environment.
After the initial setup, you'll hardly ever change it in QA and Prod, and migrations will never screw this up.
In development   : $$FailureEmail=developer1@xyz.com,developer2@xyz.com
In QA / Testing  : $$FailureEmail=developer1@xyz.com,developer2@xyz.com,QA_TEAM@xyz.com
In Production    : $$FailureEmail=IT_OPERATIONS@xyz.com,ETL_RUN@xyz.com,BI_USERS@xyz.com
6. Standard (Good) Practices
These are some of the standard practices related to Email Tasks that I would recommend. The reasons have been explained above.
a) Reusable email task that is used by all sessions in the workflow.
b) Suspend on error set at the workflow level and failure email specified for each session.
c) Fail parent if this task fails (might not be applicable in 100% of the cases).
d) Workflow success email (based on requirement).
e) Emails mentioned only in the parameter file (no hard-coding).
7. Common issues/Questions
Warning unused variable $$FailureEmail and/or No failure emails:
Make sure you use the double dollar sign ($$FailureEmail), as you should for all user-defined variables (unless you are just using the Integration Service variable $PMFailureEmailUser). Once that is done, the reason for the warning and/or missing failure email could be:
a) You forgot to declare the workflow variable as described in step 3 above, or
b) the workflow parameter file is not being read correctly (wrong path, no read permissions, invalid parameter file entry, etc.).
Once you fix these two, you should be able to see the success and failure emails as expected.
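Also, if you start the workflow from the command line, double-check that the parameter file path passed to pmcmd is the one you expect. A rough sketch (the service, domain, password and file-system path below are placeholders for this example; the folder and workflow names are the ones used above):
pmcmd startworkflow -sv INT_SVC_DEV -d Domain_Dev -u rchamarthi -p <password> -f rchamarthi -paramfile /infa/params/wkf_test.param -wait wkf_Test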

Informatica Unable to fetch log

Quite often, you might come across the following error when you try to get the session log for your session.
Unable to Fetch Log.
The Log Service has no record of the requested session or workflow run.
The first place to start debugging this error is one level up from the session log: the workflow log.
The most common reasons I have seen for this are the following:
a) One or more of the following parameters have been specified incorrectly:
  • Session Log File directory
  • Session Log File Name
  • Parameter Filename (at the session (and/or) workflow level)
b) You do not have the necessary privileges on the directory to create and modify (log) files.
Whatever the case, the workflow log is your next point of debugging. In my test scenario, I entered the following values for the log file directory and name to simulate this error.
Session Log File directory : $InvalidLogDir\
Session Log File Name : s_m_test_cannot_fetch_log.log
When I ran the workflow, the session failed and I could not get the session log (because it was never created). The error in the workflow log is as follows.
Session task instance [s_m_test_cannot_fetch_log] : 
[CMN_1053 [LM_2006] Unable to create log file
[$InvalidLogDir/download/INFA/QuickHit/ParmFiles/s_m_test_cannot_fetch_log.log.bin].
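The fix here is simply to point the session log attributes back at a valid location, for example the built-in $PMSessionLogDir service variable (a sketch based on this test session; adjust to your environment):
Session Log File directory : $PMSessionLogDir\
Session Log File Name      : s_m_test_cannot_fetch_log.log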
There seems to be too much guesswork in fixing this error, based on the posts on internet forums. Somehow, developers seem to think this error indicates something wrong with the Informatica client installation.
The next time you get this error, please check your workflow log.
If you have seen this happen for another reproducible case, please comment and I'll modify the post to include it.

Informatica Workflow Successful : No Data in target !

This is a frequently asked question in the Informatica forums, and the solution is usually pretty simple. However, that will have to wait till the end, because there is one important thing you should know before you go ahead and fix the problem.
Your workflow should have failed in the first place. If this were in production, support teams should know that something failed, report users should know the data in the marts is not ready for reporting, and dependent workflows should wait until the issue is resolved. This coding practice violates the age-old principle of failing fast when something goes wrong; continuing flawed execution while pretending "all is well" causes the toughest-to-debug defects.
Of course, this is not specific to Informatica. It is not uncommon to see code in other languages that follows this pattern. The only issue specific to Informatica is that this is the default behavior when you create a session, so you might have this "bug" in your code without even knowing it.
Stop On Errors:
Indicates how many non-fatal errors the Integration Service can encounter before it stops the session. Non-fatal errors include reader, writer, and DTM errors. Enter the number of non-fatal errors you want to allow before stopping the session. The Integration Service maintains an independent error count for each source, target, and transformation. If you specify 0, non-fatal errors do not cause the session to stop.
Optionally use the $PMSessionErrorThreshold service variable to stop on the configured number of errors for the Integration Service.
In Oracle, it is the infamous "WHEN OTHERS THEN NULL".
BEGIN
   NULL;  -- <process some data>
EXCEPTION
   WHEN OTHERS THEN
      NULL;  -- silently swallow every error: the anti-pattern
END;
/
In Java, it is something like this:
try {
   fooObject.doSomething();
}
catch ( Exception e ) {
   // do nothing
}
The solution to this problem in Informatica is to set a limit on the number of allowed errors for a given session using one of the following methods.
a) Have "1" in your default session config: fail the session on the first non-fatal error.
b) Override the session configuration details and set "Stop On Errors" to 1 or another fixed number.
c) Use the $PMSessionErrorThreshold variable and set it at the Integration Service level. You can always override the variable in the parameter file (see the sketch below). Take a look at this article on how you can do that.
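A minimal parameter file sketch for option c), assuming the sessions' "Stop on errors" attribute is set to $PMSessionErrorThreshold (the folder, workflow and session names below are just the ones from the earlier example and are placeholders here):
[rchamarthi.WF:wkf_Test.ST:s_m_T1]
$PMSessionErrorThreshold=1
 
[rchamarthi.WF:wkf_Test.ST:s_m_T2]
$PMSessionErrorThreshold=1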
Remember, if your sessions do not fall into one of the following categories, you are doing it wrong!
a) Your session fails and causes the workflow to fail whenever any errors occur.
b) You allow the session to continue despite some (expected) errors, but you always send the .bad file and the log file to the support/business team in charge.
Why is there no data in Target
The answer to "why didn't the records make it to the target" is usually pretty evident in the session log file. The usual cause (based on most of the times this question is asked) is that all of your records are failing with some non-fatal error.
The only point of this article is to remind you that your code has to notify the right people when the workflow did not run as planned.

ORA-01403: no data found

A pretty common Oracle error, raised when you try to fetch data from SQL into a PL/SQL variable and the SQL does not return any data.
>> Using the data from this schema
 
SQL> SELECT COUNT(*)
  FROM scott_emp
  WHERE empno = 9999;
 
  COUNT(*)
----------
         0
 
SQL> DECLARE
   l_ename scott_emp.ename%TYPE;
   l_empno scott_emp.empno%TYPE := 9999;
BEGIN
   SELECT ename
     INTO l_ename
     FROM scott_emp
     WHERE empno = l_empno;
END;
/
 
DECLARE
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at line 5
What to do next
1. Re-raise it with an error message that provides more context.
DECLARE
   l_ename scott_emp.ename%TYPE;
   l_empno scott_emp.empno%TYPE := 9999;
BEGIN
   SELECT ename
     INTO l_ename
     FROM scott_emp
     WHERE empno = l_empno;
EXCEPTION
  WHEN no_data_found
   THEN raise_application_error(-20001,'No employee exists with employee id ' || l_empno);
END;
/
ERROR at line 1:
ORA-20001: No employee exists with employee id 9999
ORA-06512: at line 11
2. Suppress the error if this is a valid business scenario and do the necessary processing.
Example case: if a user has a preference to display amounts in their local currency, convert the amount; otherwise, display it in USD.
 
CREATE OR REPLACE PROCEDURE p_calc_sales_metrics(
   p_user_id IN users.user_id%TYPE,
   p_profit  IN net_sales.profit%TYPE
) AS
  l_pref_currency    user_prefs.pref_currency%TYPE;
  l_cur_conv_factor  NUMBER;
  l_profit_local_amt net_sales.profit%TYPE;
BEGIN
 
  -- other code
 
  BEGIN
     SELECT pref_currency
       INTO l_pref_currency
       FROM user_prefs
      WHERE user_id = p_user_id;
 
     l_cur_conv_factor := get_conv_rate('USD', l_pref_currency);
  EXCEPTION
     WHEN no_data_found
      THEN l_cur_conv_factor := 1;   -- no preference stored: default to USD
  END;
 
  -- other code
 
  l_profit_local_amt := p_profit * l_cur_conv_factor;
 
END;
/
3. Functions called from SQL do not propagate the NO_DATA_FOUND exception; the SQL engine treats it as "no data" and the function call simply returns NULL to the calling statement (when called from PL/SQL, the exception is still raised).
CREATE OR REPLACE FUNCTION STGDATA.f_get_ename(
   i_empno IN scott_emp.empno%TYPE
) RETURN scott_emp.ename%TYPE
AS
  l_ename scott_emp.ename%TYPE;
BEGIN
 
  SELECT ename
    INTO l_ename
    FROM scott_emp
   WHERE empno = i_empno;
 
  RETURN l_ename;  
 
END;
/
 
SQL> SELECT f_get_ename(7839) FROM dual;
 
F_GET_ENAME(7839)
---------------------
KING
 
SQL>  SELECT f_get_ename(9999) FROM dual;
 
F_GET_ENAME(9999)
-------------------------------------------
 
 
SQL> SELECT nvl(f_get_ename(9999),'NULL RETURNED') FROM dual;
 
NVL(F_GET_ENAME(9999),'NULLRETURNED')
------------------------------------------------
NULL RETURNED

Friday, August 10, 2012

Introducing Informatica Cloud 9 – The Defining Capability For Cloud Computing

Today we made an announcement called Informatica Cloud 9. This is the culmination of many years of hard work and effort and builds on the Informatica 9 announcement we made last week. So what is so special about Informatica Cloud 9? Is it the new Platform-as-a-Service offering? Or is it the new Cloud Services we delivered? Or is it the new capabilities on Amazon EC2? What are all these things and why are they important?
Let me explain:
Informatica Cloud 9 started over four years ago when we noticed the beginnings of a revolution happening around us – namely Cloud Computing.  Many of you may not be familiar with our work in the Cloud, but we have been very focused on delivering data integration as a “Software-as-a-Service” (SaaS) solution.  This has meant taking our enterprise class capabilities and simplifying the interface to an extent where a simple non-technical business user can point-and-click to connect a cloud application with an on-premise application.
We have always believed that this is critical to realize since business users typically put off the integration tasks because they don’t want to rely on IT for time or resources. So we wanted to make it incredibly easy for the business user to do this on their own.  We focused on salesforce.com and built data movement and then data synchronization capabilities.  Indeed, our Data Loader Service for salesforce.com was voted the best data integration solution on their AppExchange this year.
However, our belief is that while business users must be able to define an integration task, their IT colleagues should be able to see the same integration processes from within their environment.  Only then will the business user be able to truly bring new applications into critical usage.  The same needs to be true the other way around – we want to be able to define complex integrations and enable business users to be able to run them through an easy-to-use browser interface.  Only then will the CFO, and others, be confident that they can trust the data being deployed across public and private clouds and begin to embrace cloud computing for core business requirements.  We call this Business-IT collaboration and it was a big part of last week’s Informatica 9 announcement as well.
Cloud computing is re-defining IT, and data integration needs to follow suit. It is data integration that will be THE defining capability for cloud computing – not outsourced datacenters or sexy new application solutions. So, to really embrace cloud computing, one needs a whole new way of delivering enterprise data integration that brings together the ease of use that business users require with the sophistication that IT architects must deliver. Otherwise, cloud computing will simply remain the domain of non-critical, fancy-looking applications on the periphery of true enterprise business requirements.
Informatica Cloud 9 is a significant step forward in solving this problem of providing data integration in the clouds.  With today’s announcement, anyone who is involved with Informatica can build, share and deploy any data integration components and deploy them anywhere. These components may relate to data quality or data integration or indeed, with time, any of the other components that make up the Informatica 9 Platform. After all, Informatica Cloud 9 is built on the comprehensive and unified Informatica 9 and therefore will eventually inherit the core capabilities of the platform.
Informatica Cloud 9 delivers on this belief and provides three critical components towards that goal:
  • With Data Quality Cloud Edition and PowerCenter Cloud Edition on Amazon EC2 we are providing a low-cost hourly build capability for IT users. With this, developers can build complex integrations between applications that can be published to non-technical line-of-business managers to consume and manage.  Indeed any of the 50,000+ developers on the Informatica TechNet, or any of our Systems Integration partners will be able to do this.  These integrations can be thought of as templates – picked up by anyone using the Informatica Cloud 9 Platform and re-deployed.
  • With the Informatica Cloud 9 Platform-as-a-Service we are providing the multi-tenant, scalable enterprise engine for deploying data integration in the clouds. One note that you may not be familiar with – we are already running over 17,000 jobs a day through our multi-tenant Informatica Cloud Services and moving over three Billion rows a month of client data.
  • With our new Informatica Cloud 9 services we are enhancing our own simple-to-use suite of data integration cloud applications that continue to evolve the role of the business user to be self-sufficient in their approach to accessing and integrating trustworthy cloud-based and on-premise data.
Now an enterprise can enable business users to use any cloud application and remain in control of their most critical asset – their data.  Developers can share re-usable templates across the business; System Integrators can build data integration templates for specific cloud and on-premise applications and deploy them across to their clients; consultants can move from client to client with toolboxes of pre-configured templates.
Informatica Cloud 9 is the evolution of enterprise data integration to the clouds.  Take a few moments please to re-read the press release and, in particular, the quotes therein:
  • “Informatica Cloud 9 will dramatically simplify cloud-to-cloud and cloud to on-premise data integrations…”
  • “… the ability to develop more complex mappings and workflows and run them as custom services for line of business managers will allow us to continue to provide self-service, while IT remains in control…”
  • “… we’ve developed an SAP data integration as a service solution…”
  • “… we plan to develop re-usable templates to accelerate time to market and reduce total cost of ownership for our customers… “
  • “Informatica Cloud Platform gives us the power and flexibility to meet enterprise requirements and deliver solutions to non-technical business users …”
Hopefully now you can see why we are all on Informatica Cloud 9 here!

How Big Data Changes Data Integration

With Big Data systems now in the mix within most enterprises, those charged with data integration are interested in how their world will soon change. Rest assured, most of the patterns of integration that we deal with today will still be around for years to come.
However, there are some clear trends that data integration managers need to understand, such as:
  • The ability to imply structure to the data at the time of use.
  • The ability to store both structured and unstructured data.
  • The need for faster data integration technology.
The ability to imply structure to the data at the time of use refers to the fact that Big Data systems using the Hadoop set of technologies have the ability to add a structure at the time of use. Thus, you don't need to pre-define a structure as we do in the world of relational data; you can map a structure to existing data.
While this has certain advantages, such as the ability to create dynamic structure around in-line analytical services, this also causes some complexity when dealing with data integration technology. Most data integration technology leverages some type of structure on either end of the integration flow. The idea is that you need to layer a structure as the data is consumed, translated, and produced from one system or data store to another.
The ability to store both structured and unstructured data, as related to the layering in a dynamic structure, brings both complexity and flexibility. Big Data systems are basically file systems with anything and everything stored in them. This means that documents, text, and data are all intermingled. This information may be bound to a structure, or freestanding.  In any event, you need to provide the ability to move both structured and unstructured data from store to store.
The need for faster data integration technology is a result of the fact that we deal with much larger volumes of data than more traditional enterprise systems. Therefore, there is more data that has to be moved from data store to data store. Thus, there is a renewed focus on data integration technology’s ability to keep up with the data integration performance requirements.
In many respects, the ability to create a data integration solution that is able to move larger volumes of structured and unstructured data between data stores is dependent upon the way you’ve designed the data integration flows, as much as the data integration technology itself. As Big Data systems move into your enterprise, and you join them together using data integration technology, you’ll find that the patterns of the integration flows need to change as well. Before these systems are put into production, it’s a good idea to review what needs to change and best practices around the design of the integration flows.
Big Data is more of an evolution around the way we store and deal with data. It provides more primitive commodity mechanisms that provide more flexibility and the ability to deal with larger amounts of data using highly distributed data management technology. Data integration technology needs to adapt to this change, which is further reaching than anything we’ve seen of late.

How Integration Platform-as-a-Service Impacts Cloud Adoption

Did you know that Forrester estimates in their 10 Cloud Predictions For 2012 blog post that on average organizations will be running more than 10 different cloud applications and that the public Software-as-a-Service (SaaS) market will hit $33 billion by the end of 2012?
However, in the same post, Forrester also acknowledged that SaaS adoption is led mainly by Customer Relationship Management (CRM), procurement, collaboration, and Human Capital Management (HCM) software and that all other software segments will “still have significantly lower SaaS adoption rates”. It’s not hard to see this in the market today, with cloud juggernaut salesforce.com leading the way in CRM, and Workday and SuccessFactors doing battle in HCM, for example. Forrester claims that amongst the lesser known software segments, Product Lifecycle Management (PLM), Business Intelligence (BI), and Supply Chain Management (SCM) will be the categories to break through as far as SaaS adoption is concerned, with approximately 25% of companies using these solutions by 2012.
I am not at all surprised that CRM, and HCM are leading the way as far as SaaS application adoption is concerned. One only needs to examine the reason behind why these categories took off. During the so-called “Great Recession,” companies wanted an efficient way in which to grow revenues and cut costs. On the revenue side of the equation, companies found that sales force automation (SFA) helped them close more deals faster, and increased customer visibility allowed them to focus on customer retention as well as potential upsell and cross-sell opportunities. On the costs side, some have argued that functions such as HR and Talent Management were the first to be moved to the cloud as they were considered “non-core”.
As data volumes, global deployment, and end-user adoption grew, it became increasingly clear that out-of-the-box CRM or HCM functionality was not going to cut it, and that customization options would be necessary. Out of this necessity evolved the world of Platform-as-a-Service (PaaS). Similar to the concept of SaaS, a PaaS environment involves built-in scalability, reliability, security, databases, interfaces to web services and a container with development tools for building custom apps. Salesforce.com was one of the early creators of this new cloud ecosystem with its Force.com platform. This platform, which now numbers over 220,000 apps (as of the publication of this blog post) provides numerous options to customize a CRM deployment as well as build websites, and numerous productivity-enhancing and vertical specific apps from the ground up that tied into the core CRM functionality.
While the PaaS ecosystem provides a great avenue to build custom apps and increase cloud application adoption, too often it is tied into the code-base of the dominant SaaS player that brought it into existence. As a result, other non-CRM and non-HCM functions such as PLM, SCM, BI, and ERP still largely remain in the on-premises world.
This is where iPaaS, or integration PaaS comes into play. Each of these other non-CRM functions is an important part of the value chain, whether upstream, or downstream. PLM and SCM systems for instance interact frequently with ERP systems. BI and analytics software have multiple touch points with all these systems. Integrating all these systems together and tying them to specific customer records in the CRM system has been such a time-consuming task that most SaaS providers simply chant the mantra of “web services” when asked by customers how they can connect various SaaS ecosystems together. Web services typically accomplish a very specific business process and specific task between two different SaaS applications, and the web services APIs do not lead to repeatability. In fact, a Slashdot blog on the API economy mentioned that there were some 5,000 APIs estimated by the end of 2012 and some 30,000 estimated in the next four years.
The proliferation of APIs along with SaaS adoption only strengthens the need for an integration PaaS that abstracts the underlying orchestrations of these APIs from end users. An integration PaaS allows developers to build full (or partial, if desired) native connectivity to every single object within an application, whether SaaS or not. By building native connectors, every permutation and combination of objects between different SaaS applications is possible, thereby increasing the possibility for companies to choose those SaaS apps that fit their business function or department. With increasing confidence in the existence of an integration PaaS, companies will continue to adopt SaaS apps in other LOBs, and not just the mainstream CRM or HCM categories. This in turn spurs the other SaaS category providers to invest more in R&D and come out with even more advanced functionality.
With the increased innovation occurring across all SaaS applications, we can expect more and more complex use cases involving larger amounts of data. All of this coupled with custom apps built on competing PaaS platforms will only further increase the use of an integration PaaS to achieve cloud data integration.

Delivering IT Value with Master Data Management and the Cloud

Over the last few years most enterprises have implemented several (if not more) large ERP and CRM suites. Although these applications were meant to have self-contained data models, it turns out that many enterprises still need to manage “master data” between the various applications. So the traditional IT role of hardware administration and custom programming has evolved to packaged application implementation and large scale data management.  According to Wikipedia: “MDM has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data throughout an organization to ensure consistency and control in the ongoing maintenance and application use of this information.” Instead of designing large data warehouses to maintain the master data, many organizations turn to packaged Master Data Management (MDM) packages (such as Informatica MDM). With these tools at hand, IT shops can then build true Customer Master, Product Master (Product Information Management – PIM), Employee, or Supplier Master solutions.
MDM solutions vary by industry in terms of tactical approaches taken – e.g., pharmaceutical/life sciences will adopt semi-batch, database-centric approaches for master physician data to be deployed to sales forces, while financial services providers and online retailers will require near real-time, business process-centric solutions to compete in the business-to-consumer (B2C) online world. These different types of implementations require technical IT expertise in delivering an end-to-end solution.  Based on quarterly surveys of the MDM Institute Business Council™ (8,000+ subscribers to the MDM Alert newsletter engaged in MDM projects), the perennial top four business drivers for MDM initiatives are summarized as:
(1)    compliance and regulatory reporting;
(2)    economies of scale for mergers and acquisitions (M&A);
(3)    synergies for cross-sell and up-sell;
(4)    legacy system integration and augmentation.
Note that this list represents business drivers, not technical initiatives. Ideally, the business analyst "owns" the data and is responsible for the initial definition of what the master data looks like (whether this comes from a custom application or a packaged solution). In addition, they are responsible for the processes (not the actual data entry) of inputting the data into the source systems. IT acts as "data stewards" – coordinators between the various business groups. IT's role should be that of project managers who phase in updates to the primary MDM. These data stewards must be equally savvy in data modeling and business processes. IT must also be the technical gurus who glue applications and databases together. This also involves data quality processes, such as standardization, cleansing, validation, enrichment and matching.
Traditional MDM solutions have been implemented on premise, primarily as data hubs to various applications spokes such as Human Resources, PLM, ERP, and CRM applications. With the huge uptick of software as a service (SaaS) CRM providers such as salesforce.com, this requires MDM solutions to integrate data from the cloud.
While an on-premise model works well when most of the data is updated within the “four walls” of the enterprise, a hybrid cloud + on premise model may be better suited to a B2C environment when massive customer updates happen on a seasonal basis. In this case, a hybrid model will allow for extra cloud resources to be tapped in order to increase performance. In addition, with a hybrid model, sensitive data that may be legally prohibited from residing in the cloud can be kept on premise.
Should MDM be completely implemented in the cloud?
In this case, the master data model engine will reside in the cloud and will act as a hub between multiple SaaS applications and potentially on premise applications. A common scenario might be managing customer data between Salesforce CRM, Order fulfillment with UPS services, and on-premise ERP Receivables. Or replace the on-premise ERP solution with a cloud-based ERP such as NetSuite. In these cases, having MDM in the cloud might be the right approach. A cloud-based solution also makes sense for piloting a longer term MDM project. So look for a vendor that provides both on-premise and cloud-based MDM solutions for maximum deployment flexibility.
The Hybrid IT organization continues to evolve with new responsibilities. Cloud-based solutions tend to free up the IT staff from the more routine data center operations to get more involved with business activities such as Master Data Management. IT will play an important role in managing MDM solutions. Although, they don’t “own” the data, the technical requirements for implementing a solution remain in the IT domain. And acting as a data steward to capture the business requirements of what data needs to be managed and formulate the detailed rules and processes will become a key role. IT will also need to decide between on-premise and cloud-based architectures for the enterprise.
—-
Mercury Consulting is a trusted technology advisor with deep expertise in cloud applications. We offer strategic guidance to senior executives to select the right cloud solution and services assistance to help enterprises accelerate their adoption of cloud solutions.
Mike Canniff is a faculty member of Management Information Systems at the University of Pacific – Eberhardt School of Business. He has worked in the Information Technology field for over 20 years beginning with IBM as a software engineer and as Vice President, Development for Acuitrek Software. Mike has specialized his career research in the areas of Enterprise Application Integration and Electronic Commerce systems. He has published several papers on Electronic Commerce and Business Process Management best practices.

Electronic Trading Systems Moving to the Cloud

More and more business applications are moving from the desktop to the cloud, and electronic trading applications are no different.
Over the last five or ten years, application vendors have established several advantages of running major applications, even mission-critical applications like salesforce.com, over the cloud.
These advantages include:
  • Easier and smoother upgrades, which provide much better adaptability and agility in the face of changing market and business conditions, plus a better user experience,
  • Better scalability, with newer technology advances, and
  • Better portability across a wide array of device types, including smartphones and tablets (especially in the last 2-3 years).
Recent improvements in Web technology, such as HTML5 WebSockets, are helping to speed this transition along by providing several throughput and latency advantages over earlier iterations of Web technology, and even over native Windows applications. Now, application architects can freely choose the technology that provides a better path for growth, agility, and scalability, which is often a Cloud-based solution.
As I write this, a few of our customers who provide electronic trading solutions to their clients are making the strategic move to develop a next generation application based in the Cloud. The main driver for one customer was to be able to take on more clients more quickly and therefore grow the business faster by increasing marginal revenue and profitability. They found the list of challenges with a thick desktop client to be just too big for growing the business as quickly as they wanted to — or needed to.
Messaging middleware, especially peer-to-peer solutions such as Informatica Ultra Messaging, can be a very important piece of a Cloud-based application. The peer-to-peer “nothing in the middle” model provides applications not just ultra-high performance (whether for high throughput or low latency), but also near-linear scalability, true 24×7 reliability and availability, and business and IT agility. These qualities tie directly to the advantages listed above.
Cloud-based applications, of course, must also contend with the Internet and all that comes with that: support for various browsers and platforms (and versions of each), scalability and bandwidth issues, and mobile devices like smartphones and tablets. New web technologies like HTML5 WebSockets from Kaazing are best positioned to take care of the path from server to the smartphone or tablet, and with JMS connectivity to Ultra Messaging on the back end, can provide a Cloud-based application with a lean, scalable and agile infrastructure, usually with less hardware.