Data Sources

Flat Files & Databases

Let's turn our attention to the two most common data sources:

  • Flat files are simple text files that contain data, while databases are organized collections of data that can be accessed, managed, and updated easily. To access flat files, you can use a text editor or a spreadsheet program like Microsoft Excel. You can also use a programming language like Python to read data from flat files and write data to them.

  • To access databases, you use a database management system (DBMS) like MySQL, Oracle, or Microsoft SQL Server. You can use the Database Connection Wizard in Pentaho to connect to a database and retrieve data from it.
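
To make the two access paths above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a prescribed workflow: the sample data, the `people` table, and the helper names are hypothetical, and SQLite (through Python's built-in `sqlite3` module) stands in for a server DBMS such as MySQL so the example runs without any setup.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; a real flat file would normally live on disk.
FLAT_FILE_TEXT = "id,name,city\n1,Ada,London\n2,Grace,New York\n"


def read_flat_file(text):
    """Parse CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def write_flat_file(rows, fieldnames):
    """Serialize a list of dicts back into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def load_into_database(rows):
    """Copy the rows into an in-memory SQLite table and return the connection."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE people (id INTEGER, name TEXT, city TEXT)")
    connection.executemany(
        "INSERT INTO people (id, name, city) VALUES (?, ?, ?)",
        [(int(row["id"]), row["name"], row["city"]) for row in rows],
    )
    connection.commit()
    return connection


people_rows = read_flat_file(FLAT_FILE_TEXT)
people_db = load_into_database(people_rows)

# Retrieve data from the database with a parameterized query.
london_names = [
    record[0]
    for record in people_db.execute(
        "SELECT name FROM people WHERE city = ?", ("London",)
    )
]
```

The flat file is read and written as plain text, whereas the database is reached through a connection and queried with SQL. Connecting to MySQL, Oracle, or SQL Server from Python would follow the same connect-then-query pattern with a vendor-specific driver in place of `sqlite3`.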

Structured data is considered the most traditional form of data storage, as early database management systems (DBMS) were designed to handle this format. This type of data relies on a predefined data model, which outlines how the data is stored, processed, and accessed. The model ensures each piece of data, or field, is distinct, enabling targeted or comprehensive queries across multiple data points. This feature makes structured data exceptionally versatile, allowing for efficient aggregation of information from different database segments.
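
As a rough illustration of a predefined data model, the sketch below declares two tables up front and then runs one targeted query and one query that aggregates across both tables. The `customers` and `orders` tables, their columns, and the sample values are all hypothetical, and SQLite is assumed only because it ships with Python.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# The data model is declared before any data is stored:
# every field has a name and a type.
connection.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
""")

# Targeted query: a single field of a single record.
customer_name = connection.execute(
    "SELECT name FROM customers WHERE customer_id = ?", (1,)
).fetchone()[0]

# Comprehensive query: combine both tables and aggregate per customer.
totals_per_customer = connection.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Because each field is distinct and typed, the same data can answer both the narrow lookup and the cross-table total without any change to how it is stored.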

At its core, structured data follows a specific format, making it easily analyzable. It fits into a tabular structure, with clear relationships between the rows and columns, similar to those found in Excel spreadsheets or SQL databases. These containers organize data into defined rows and columns, facilitating straightforward sorting and manipulation.
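
The row-and-column layout described above can be sketched in plain Python, where each dictionary is a row and each key is a column. The `orders` data and the helper names `sort_rows` and `total_by` are hypothetical and serve only to show why a fixed tabular shape makes sorting and aggregation simple.

```python
from collections import defaultdict

# Hypothetical table: every row has the same columns.
orders = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 80.0},
    {"order_id": 3, "region": "East", "amount": 50.0},
]


def sort_rows(rows, column, descending=False):
    """Return the rows ordered by the values in one column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


def total_by(rows, group_column, value_column):
    """Sum one column's values for each distinct value of another column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)


orders_by_amount = sort_rows(orders, "amount", descending=True)
amount_by_region = total_by(orders, "region", "amount")
```

Both helpers rely only on the column names, which is what a consistent tabular structure guarantees for every row.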
