Effectiveness of Digital Transformation in Data Warehouse Technology

Abstract


A. Introduction
The amount of bank data is growing exponentially due to developments in information technology (IT).This size of data, which is impossible to manage in traditional databases, can help in the decision-making process.Data warehouse A data warehouse is a collection of data with the characteristics of being integrated, subject-oriented, time-variant, and non-volatile [8] [9] [10].Processed data can support the decision-making process for management.The Online Analytical Processing (OLAP) method is a merging process based on dynamic analysis and consolidation for large multidimensional data so that it can support interactive examination and manipulate data from various perspectives [1].Several DWH studies have been carried out with different approaches in design and implementation, from specific aspects such as the use of the Nine Step Design Methodology to application for multidimensional analysis and business intelligence [11] [12] [13].
The problem with Bank ABC is that there is a need for data processing for integration, scalability, upgrading, and updating the DWH system.Therefore, research was carried out to design a transformation model and test its effectiveness on DWH technology.This process is carried out in a single staging environment so that data management becomes integrated, processing is efficient, and the decision-making support process is fast and accurate.Apart from that, it can also increase customer satisfaction [3] [14].Therefore, improving the DWH consolidation process with a Performance Measurement System (PMS) can have a positive impact on banking operations and customer service [10] [15] [16].

B. Research Method
This research begins with the stages of problem analysis and identification, determining research objectives, preparing raw materials in the form of the old ETL process, working on the ETL process, testing and validation, and reporting (Figure 1).The improvement process approach is carried out using the convert method.The conversion mechanism for a PMS job is to switch the source that previously used AS400 to an Oracle database, as shown in Figures 2 and 3.
The process of completing the DWH design to increase the efficiency of the PMS process (Figure 3) begins with a pre-assessment, namely.At this stage, a list of ETL jobs that use AS400 is created, along with a list of tables sourced from AS400 in the PMS process.This information will be the basis for the next steps in changing connections to the PMS ETL job, especially in importing data from the AS400 to the systems reference code (SRC) scheme in the PMS database.
The next step is to backup all existing jobs on the DataStage v11.5 server.Backup was also carried out on all open database connectivity (ODBC), TNS, and DB2 catalogs on the DataStage 11.5 server.This step is taken to ensure data security and integrity before making significant connection changes to the system.
After the backup is complete, connection changes to the ETL PMS job are continued using the transparent network substrate (TNS), which is already available in the database configuration.This stage is very important to ensure the smooth operation of all ETL jobs, including verification regarding whether the output results are in accordance with the previous method and system environment.The implementation of this change is intended so that the system can run more efficiently and in accordance with the latest needs of the organization.Currently, the operational system describes the running conditions as shown in Figure 4.There are two main system sources, namely non-AS400 systems and AS400 systems, which contribute data to the PMS process.The process of withdrawing data from non-AS400 systems is carried out through the ETL mechanism, while data originating from text files is retrieved using the secure file transfer protocol (SFTP) mechanism (G et al., 2020).
On the other hand, retrieving data from the AS400 system is divided into two methods.First, the data is retrieved and synchronized into Oracle's operational data store-data warehouse (ODS-DWH) via the ETL mechanism.Second, there is a special process using the VTL (Virtual Tape Library) mechanism (Yan et al., 2015) (Iliadis, Jordan, Lantz, & Sarafijanovic, 2022), namely that data from the AS400 system is processed by backing up and restoring.
Furthermore, in the context of data processing, it can be seen that the process of pulling and processing data from ODS-DWH Oracle to Data Mart generally uses the DataStage platform.Likewise with the process of withdrawing and processing data from the STG-AS400 to Data Mart, which also relies on the DataStage platform.Figure 6 provides a visual overview of the architecture and current process flow.The use of DataStage as a platform for data processing is characteristic of this process, demonstrating good integration in managing and transforming data from various sources for PMS needs.

Figure 5. Proposed Conditions
The PMS process includes three main stages.The first stage is to withdraw data from a non-AS400 system using the ETL mechanism.This step ensures efficient integration of data into the PMS through data extraction, transformation, and movement.The second stage involves pulling data from the AS400 system to ODS-DWH Oracle via an ETL mechanism, ensuring synchronization and integration of data from both systems.The final stage is pulling and processing data from ODS-DWH Oracle to Data Mart using the DataStage platform.This process is illustrated in Figure 5, providing a visual representation of the proposed system architecture and workflow.Technically, this process is described in a flowchart and also for the testing and validation process (Figure 6).

C. Result and Discussion
Data Warehouse Update.The first stage in the ETL process is data extraction from the newly defined ODS-DWH-Oracle database.In this context, it is important to create a database structure that suits your needs.To simplify integration, a schema mapping was carried out from DB2 to Oracle, and the relevant schema changes can be seen in Table 1.The next step is the transformation stage (Figure 7).There are two approaches used in this process.First, the transformation was performed using the DataStage ETL tool, and second, using queries to handle differences in language syntax between DB2 and Oracle.

Figure 7. Transformation process
The final stage of the ETL process is loading, i.e., the data that has been extracted and transformed is placed in DWH as the final destination.This data can then be used by PMS, including for reporting purposes to Bank Indonesia.The ETL process involves the steps of data extraction from sources, transformation to maintain data consistency and clarity, and data placement in DWH.
First, the extraction stage in the ETL PMS process focuses on extracting data from the old database (DB2) to the new database (Oracle) so that it is integrated in ODS-DWH-Oracle.This process requires detailed mapping of schema changes to ensure accurate data transitions.Table 1 provides an overview of the schema changes from the AS400 database to Oracle and also provides an important basis for creating more efficient and tailored ETL jobs.
Second, in the transformation stage, two approaches are used: 1) direct transformation is carried out in the DataStage ETL tool, and 2) there is a change in language syntax between DB2 and Oracle.This transformation process, illustrated in Figure 8, is a crucial step to ensure the extracted data is converted to the correct format before being loaded into the DWH.
The final stage is loading, the data that has been transformed is placed in DDWH.This step is very important because it ensures that the data is available for use by the Performance Measurement System (PMS) for reporting purposes, including reports that must be submitted to Bank Indonesia.Thus, the entire ETL process carries out a series of integrated stages, from extraction to data loading, to effectively support the functions and needs of the PMS.
Next, for the work stage, the first step taken in implementing changes to the ETL system is Backup Job Existing.This stage involves creating backup copies of all existing ETL jobs on the DataStage v11.5 server and ODBC, TNS, and DB2 catalog configurations.The existence of this backup copy is crucial before moving on to changes, as a start in the backup process (Figure 8).

Figure 8. Step 1 Backup Job
The logic and configuration stage focuses on aspects of connections to source and target systems, including database and file connections.A detailed explanation is provided regarding the use of ODBC by IBM DataStage to connect to SQL Server, DB2, and various other databases.This information is an important foundation for ETL jobs to be connected correctly to the desired data sources, according to the source documentation.After the testing process is complete, the Running Job ETL stage is carried out to confirm the results of the ETL job, which has been successfully executed (Figure 12).This stage is important to ensure that the converted ETL job runs successfully and produces output according to expectations.Finally, the Data Validation Test stage involves examining the output of each ETL stage, including sums (SUM) and calculations (COUNT) on specific data.Success criteria are determined based on data validation that has been determined by Bank ABC.This stage confirms that the data produced by the ETL system not only works well but is also accurate and consistent with established standards.
DataStage Director Client.The features in this software are used to run, set validation schedules, and monitor ETL jobs.The ability to run and monitor ETL jobs effectively is critical to ensuring the integrity and success of the entire data warehouse update process.Figure 13 shows the user interface of this tool.
DataStage Operations Console.This feature provides visualization related to monitoring job activity, system resources, and workload management queues.The steps for using it are important to ensure that all ETL jobs run smoothly and efficiently.Figure 13 shows the interface of the DataStage Operations Console.

Figure 13. Testing Tool
Test Running.This stage tests the effectiveness of using the IBM DataStage ETL Tools with example queries.Figure 14 shows the query used for testing.In Table 2, we present the results of the effectiveness of IBM DataStage in various environments (DB2, Oracle, and running via DataStage) to evaluate the performance and effectiveness of the tool in different environments.The results of the analysis show that the DWH renewal process was carried out effectively.The changes implemented, including database migration and conversion of ETL jobs from DB2/AS400 to Oracle, have proven effective and efficient.This has been proven through various stages of testing and validation that have been carried out.From Table 2 and Fig. 13, we conducted a test running (using query Fig. 14) data source with 11191 records.It finished with 6709 records correctly and 1 of warning data.The time processing was calculated from running start until finished.The best time is covered by processing DataStage with 11 seconds, DB2 Query with 21 seconds, and Oracle with 36 seconds.

D. Conclusion
The process of updating the Data Warehouse configuration at Bank ABC succeeded in changing the initial configuration involving two sources, namely DB2 and ODS-DWH-Oracle, into a single source of truth by only utilizing the ODS-DWH-Oracle database.This clarifies and simplifies the system configuration structure, increasing clarity and efficiency in data management.Proving the effectiveness of the ETL process using DataStage is important in this update.The time required to run the ETL process is proven to be faster compared to DB2 and Oracle database queries.This conclusion shows that using the DataStage platform provides an advantage in time efficiency of 11 seconds compared to others of 36 and 21 seconds, which is a crucial factor in overall data processing.The IBM DataStage Server role helps optimize the workload on the Bank ABC database.By dividing the workload on the DataStage Server, database performance optimization is achieved, providing better support for data management and consolidation.Thus, this confirms that DataStage integration in a data warehouse environment can have a positive impact on overall system performance, support smooth operations, and minimize the potential for overload on the database server.
The upgrading and updating process of ODS-DWH Oracle with the implementation of Single Environment Staging.opens up the potential for developing better management both technically and conceptually in ETL.First, the size of the DataStage server needs to be adjusted to the workload in order to avoid the potential for server downtime.This is an important key to maintaining smooth operations and system availability.Further development of server infrastructure will support system scalability according to needs that may develop in the future.Furthermore, applying parameterization to ETL jobs is an aspect that can be optimized.This can reduce repetition in the same work process, increase efficiency, and minimize the potential for errors.By using parameterization, ETL jobs can be managed more flexibly and easily adjusted, providing the adaptability needed to address changing needs or specific scenarios.This improvement will support the overall effectiveness and efficiency of the data warehouse system at Bank ABC.

E. Acknowledgment
The author would like to thank to Universitas Bakrie for the support during the research reported in this paper.

Figure 2 .
Figure 2. Short process for updating ETL jobs

Figure 3 .
Figure 3. Implementation of a single source of truth

Figure 9 .Figure 10 .Figure 11 .
Figure 9. Step 1 Import Job Figure 11.Step 3 Import Job The ETL job testing process is a critical step before full implementation into the production Steps in the job import process that have been tested (Figures 9, 10, and 11).Verifying the quality and success of previous ETL jobs executed in a production environment is essential to avoid potential problems.

Figure 12 .
Figure 12.Running an ETL Job

Table 2 .
IBM DataStage Effectiveness Results