How Python, APIs, coffee, and a puppy helped regain trust in HR data

A large oil and gas (O&G) organization was implementing SAP SuccessFactors (SF) but struggled during the rollout because its HR data came from several sources with no data governance. After the implementation, the reports generated by SAP SF were inconsistent with other systems, so upper management did not rely on this data to make decisions.

Main issue

Management’s lack of trust in the newly implemented system and the data living in it caused many problems, chief among them the concern that a large investment had delivered no benefit. The HR department had to carry out a detailed audit to identify, fix, and explain every data gap. Although the HR department had not created many of these gaps, they surfaced during this implementation, so all eyes were on HR.

Every report, from simple headcounts to insights on oil field engineering talent across the organization and headcount forecasting, depends on reliable data to help management make decisions.

Because of these data problems, multiple users were extracting data from each system and comparing the extracts in Excel. The data changed every day, however, so this approach created a lot of repetitive work and became a bit of a nightmare, adding yet another issue caused by bad data.

Enroute’s solutions

Generate automated daily reports that could explain the differences between systems (in 2 days).

  • Since the SAP SuccessFactors solution was in the cloud, we used its API to pull the correct data with a Python script, and we connected to a business warehouse where SAP EC2 and Fieldglass data was stored.
  • The script checked critical attributes and mapped them across systems to produce a report showing whether each record was found, in which of the three systems it was found, and, if it was found in all three, whether its data matched across them.
  • These reports would not only help clean the data but could also serve in the future as a baseline for monitoring data quality.

Data cleansing/enriching actions

  • Based on the reports above, defined actions took place every week to clean and enrich the data. It was vital to work with users and to focus on issues that affected multiple cases so that we could tackle more instances with fewer actions.

Root cause analysis

  • Based on all identified issues, we performed an analysis to make sure that not only was the data fixed, but also that no new issues were being introduced.
  • Note: During this analysis, we found that the integration between systems was not working correctly and was therefore generating more bad data each day.

These actions would help clean existing data, prevent new bad data from being generated, and give management reliable HR data for making business decisions.
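
To make the daily comparison report more concrete, here is a minimal sketch of the kind of cross-system check described above. It is not the original script: it assumes the records from each system have already been pulled (via the API or the business warehouse) into dictionaries keyed by employee ID, and the system labels, attribute names, sample records, and the functions compare_systems and gap_rate are all hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List

# Hypothetical shape: {system_name: {employee_id: {attribute: value}}}
Records = Dict[str, Dict[str, Dict[str, Any]]]


def compare_systems(records: Records, attributes: List[str]) -> List[Dict[str, Any]]:
    """Build one report row per employee ID seen in any system."""
    all_ids = sorted(set().union(*(recs.keys() for recs in records.values())))
    report = []
    for emp_id in all_ids:
        # Which systems contain this record at all?
        found_in = [name for name in records if emp_id in records[name]]
        # Which critical attributes disagree among the systems that have it?
        mismatched = []
        for attr in attributes:
            values = {records[name][emp_id].get(attr) for name in found_in}
            if len(values) > 1:
                mismatched.append(attr)
        report.append({
            "employee_id": emp_id,
            "found_in": found_in,
            "in_all_systems": len(found_in) == len(records),
            "mismatched": mismatched,
        })
    return report


def gap_rate(report: List[Dict[str, Any]]) -> float:
    """Share of records that are missing somewhere or have mismatched data."""
    if not report:
        return 0.0
    bad = sum(1 for row in report if not row["in_all_systems"] or row["mismatched"])
    return bad / len(report)


# Made-up sample data standing in for the three systems (assumption).
sample: Records = {
    "successfactors": {
        "1001": {"department": "Drilling", "status": "Active"},
        "1002": {"department": "HR", "status": "Active"},
    },
    "sap_ec2": {
        "1001": {"department": "Drilling", "status": "Active"},
        "1002": {"department": "Finance", "status": "Active"},
    },
    "fieldglass": {
        "1001": {"department": "Drilling", "status": "Active"},
        "1003": {"department": "Geology", "status": "Active"},
    },
}

report = compare_systems(sample, ["department", "status"])
for row in report:
    print(row)
print(f"gap rate: {gap_rate(report):.0%}")
```

Running a check like this every day and tracking the gap rate over time is one way such a report could double as a data quality baseline.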

Business Impact

  • Closed data discrepancies from a +30% gap to a 1% gap.
  • Reduced the time key users spent on repetitive tasks, such as building Excel data comparisons, by replacing them with an automated report.
  • Increased the reliability of systems and data by cleaning data and generating automated data quality reports.

*If you want to learn how the puppy helped, please email us, and we’ll explain everything in more detail.