Pentaho Data Integration

Databases

Database Operations (CRUID)


Last updated 1 month ago

Introduction

Steel Wheels uses a straightforward Enterprise Resource Planning (ERP) system to manage its Business Units (BUs), including Human Resources, Marketing, Finance, and Supply Chain. The upcoming workshops walk through the CRUID (Create, Read, Update, Insert, Delete) database operations:

Create operations add new records to a database. This might involve inserting a new customer profile, product listing, or transaction record. In SQL, this is typically done using the INSERT statement, while in NoSQL databases, it might use methods like insertOne() or save().

Read operations retrieve existing data from the database. This could be fetching a single record by its unique identifier or querying multiple records based on specific criteria. SQL uses SELECT statements for this purpose, while NoSQL databases might use find() or get() methods.

Update operations modify existing records in the database. This could involve changing a customer's address, updating a product's price, or modifying any stored information. SQL uses the UPDATE statement, while NoSQL databases might use methods like updateOne() or save() on an existing document.

Delete operations remove records from the database. This might involve permanently removing a user account or purging old data once it has been archived. SQL uses the DELETE statement, while NoSQL databases typically use methods like deleteOne() or remove().
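
To make the four operations described above concrete, here is a minimal sketch that runs each SQL statement against an in-memory SQLite database using Python's standard sqlite3 module. The customers table, its columns, and the sample values are assumptions made up for illustration; they are not part of the Steel Wheels schema or the workshops.

```python
import sqlite3

# In-memory database; the customers table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Create: add a new record with INSERT.
conn.execute(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    (1, "Example Customer", "Boston"),
)

# Read: fetch a single record by its unique identifier with SELECT.
row_after_insert = conn.execute(
    "SELECT id, name, city FROM customers WHERE id = ?", (1,)
).fetchone()

# Update: change the customer's city with UPDATE.
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Paris", 1))
row_after_update = conn.execute(
    "SELECT id, name, city FROM customers WHERE id = ?", (1,)
).fetchone()

# Delete: remove the record with DELETE.
conn.execute("DELETE FROM customers WHERE id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The parameter placeholders (?) keep values separate from the SQL text, which is the usual way to avoid SQL injection when the values come from user input.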

Slowly Changing Dimensions

Slowly Changing Dimensions (SCDs) are a data warehousing concept that addresses how to handle changes to dimension data over time. In a data warehouse, dimensions contain descriptive attributes that are used for reporting and analysis.

Key aspects of SCDs include:

  • Type 1: Simply overwrites old data with new data, keeping no history

  • Type 2: Maintains historical data by adding new rows with effective dates/flags

  • Type 3: Adds previous value columns to track limited history

  • Type 4: Uses separate historical tables to store changes

  • Type 6: Combines approaches (typically Types 1, 2, and 3)

SCDs are crucial for maintaining data integrity in business intelligence systems when dimensional attributes change, allowing organizations to accurately analyze data across different time periods.
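
The Type 2 behavior listed above can be sketched as follows, again using Python's sqlite3 module with an in-memory database. The dim_customer table, its column names (valid_from, valid_to, is_current), and the helper apply_type2_change are hypothetical names chosen for this example; real dimension tables and the PDI steps that maintain them may be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical dimension table; column names are illustrative assumptions.
conn.execute(
    """
    CREATE TABLE dim_customer (
        surrogate_key INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id   INTEGER,
        city          TEXT,
        valid_from    TEXT,
        valid_to      TEXT,
        is_current    INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO dim_customer "
    "(customer_id, city, valid_from, valid_to, is_current) "
    "VALUES (1, 'Boston', '2020-01-01', NULL, 1)"
)


def apply_type2_change(conn, customer_id, new_city, change_date):
    """Close the current row, then insert a new current row (SCD Type 2)."""
    # Step 1: expire the row that is currently active for this customer.
    conn.execute(
        "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id),
    )
    # Step 2: add a new row carrying the changed attribute value.
    conn.execute(
        "INSERT INTO dim_customer "
        "(customer_id, city, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_city, change_date),
    )


apply_type2_change(conn, 1, "Paris", "2024-06-01")

# Both the historical row and the current row are kept.
history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current FROM dim_customer "
    "WHERE customer_id = 1 ORDER BY surrogate_key"
).fetchall()
```

By contrast, a Type 1 change would be a single UPDATE that overwrites city in place, leaving one row and no record of the earlier value.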