[DB] Import or load data from an external CSV file to a table #680

authorjapps · 2024-09-14T22:32:19Z

AC1:

Given a datafile in CSV format, matching the Database table columns,
provide mechanism to load these data(rows and columns) in the that table.

Example:

Table "players" has three columns as below:

ID:
NAME:
AGE:

and data in CSV is as below:

1001, Ronaldo, 23
1002, Devaldo, 24
1003, Trevaldo, 35

Note- There are no headers present in the above CSV

AC2:

Same as AC1, but with headers.

Example:

and data in CSV is as below:

ID, AGE, NAME <----------- 1st row as header
1001, 23, Ronaldo
1002, 24, Devaldo
1003, 35, Trevaldo

AC3 :

Required for Postgres DB (v9 to v12 should be supported )

AC4:

File could be present in the local files system(eg /test/resources)

optional:
(or in a git repo ==> implement this via a separate ticket.)

AC5 :

{
    "scenarioName": "Sample interaction with the db executor ( data from AC2)",
    "steps": [
        {
            "name": "Loading a csv file",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "LOADCSV", //<--- in uppercase (for consistency), but support both cases
            "request": {
                "tableName": "players",
                "csvSource": "path/db_sample.csv",
                "withHeaders" : false, //<--false:  this is the default value most cases. (optional)
                "nullString" : "" //<-- used to insert null values, this is the default value (optional)
            },
            "verify": { 
                "size" : 36 //<-- 36 records: number of records loaded. (optional field as usual)
             } 
       },
        {
            "name": "assert_loaded_data_via_select",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "EXECUTE", //<---- executes an in-line SQL (in this case a SELECT sql)
            "request": {
                "sql": "SELECT id, name, age FROM players WHERE age > ? ORDER BY age ASC",
                "sqlParams" : [ 23 ]
            },
            "verify": {
                "rows.SIZE": 2, //<--- Number of rows in the output (This is OOTB available, don't have to implement)
                "rows": [
                    {
                        "id": 1002,
                        "name": "Devaldo",
                        "age": 24
                    },
                    {
                        "id": 1003,
                        "name": "Trevaldo",
                        "age": 35
                    }
                ]
            }
        }
    ]
}

AC6:

From @javiertuya :
This is a proposal for discussion before starting the implementation:

Where to place the code? Ans: org.jsmart.zerocode.core.db package
Use Apache Commons DbUtils that is a very stable and lightweight library.
This will allow to implement something that does not depend on any particular DBMS.
Add an executor (e.g. DbSqlExecutor) with the operation explained in the example
as well as EXECUTE (or SELECT and UPDATE separately) to handle SQL queries with parameters
Tests would use H2, as it is already included as test scoped dependency.
Postgress requires a container(an example of Docker already present), this should be tested manually, or added to the CI pipeline if required.

More >>

CSV reading will be based/inspired on the domain Parametrized.java class. : What does this mean? Please explain this or point to the PR when implemented.

On the AC6 or AC5, please discuss with @javiertuya if you have any questions or suggestions.

ACRONYMS:

OOTB : Out Of The Box

The text was updated successfully, but these errors were encountered:

nirmalchandra · 2024-09-15T17:05:29Z

Suggestion: (WITHOUT HEADERS)

Try this for "WITH HEADERS": (replace the relevant line)

String sql = "COPY " + tableName + " FROM STDIN WITH (FORMAT csv, HEADER true)";

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Objects;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.beans.factory.annotation.Autowired;

@Service
public class ImportCsvToPostgres {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    public void importCsvToTable(String fileName, String tableName) {
        Connection connection = null;
        BufferedReader reader = null;

        // COPY command without the HEADER true option
        String sql = "COPY " + tableName + " FROM STDIN WITH (FORMAT csv)";

        try {
            // Load CSV file from resources
            ClassLoader classLoader = getClass().getClassLoader();
            reader = new BufferedReader(new InputStreamReader(
                    Objects.requireNonNull(classLoader.getResourceAsStream(fileName))));

            // Get PostgreSQL connection and execute the COPY command
            connection = jdbcTemplate.getDataSource().getConnection();
            org.postgresql.copy.CopyManager copyManager = connection.unwrap(org.postgresql.PGConnection.class).getCopyManagerAPI();
            copyManager.copyIn(sql, reader);

        } catch (IOException e) {
            throw new RuntimeException("IOException reading the file from resources: " + fileName, e);
        } catch (SQLException e) {
            throw new RuntimeException("SQLException during COPY command execution", e);
        } finally {
            // Close the resources manually //
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (connection != null) {
                try {
                    connection.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

authorjapps · 2024-09-17T06:58:24Z

Worth adding to this Page:
https://github.com/authorjapps/zerocode/wiki/Sample-DB-SQL-Executor

javiertuya · 2024-09-27T17:19:00Z

Hi @authorjapps @nirmalchandra, I find very useful the ability to interact with the database. I will like to contribute to this issue.

I have reviewed the repository and docs to get understanding of how zerocode works as it is my first contribution. I noted that the Sample-DB-SQL-Executor depends on Spring and specific DBMSs, and other classes like org.jsmart.zerocode.zerocodejavaexec.DbSqlExecutor.java do not interact with a database. I think that a DBMS vendor independent approach would be worth.

This is a proposal for discussion before starting the implementation:

Use Apache Commons DbUtils that is a very stable and lightweight library.
This will allow to implement something that does not depend on any particular DBMS.
Add an executor (e.g. DbSqlExecutor) whith the loadCsv operation that implements this issue,
as well as executeQuery and executeUpdate to handle SQL queries with parameters
Csv reading will be based/inspired on the domain Parametrized.java class.
Tests would use H2, as it is already included as test scoped dependency. Postgress requires a container, this should be tested manually, or added to the pipeline if required.
Where to place the code? maybe in a org.jsmart.zerocode.core.db namespace?
Below are some examples

{
    "scenarioName": "Sample interaction with the db executor ( data from AC2)",
    "steps": [
        {
            "name": "Loading a csv file",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "loadCsv",
            "request": {
                "tableName": "players",
                "csvSource": "path/db_sample.csv",
                "noHeaders" : false, //<-- this is the default value, guess that using header may be the most used
                "nullString" : "" //<-- used to insert null values, this is the default value
            },
            "verify": { } //<-- no data returned, log and fail the step if anything fails
       },
        {
            "name": "Read the loaded data using SQL",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "executeQuery",
            "request": {
                "sqlStatement": "SELECT id, name, age FROM players WHERE age > ? ORDER BY age ASC",
                "sqlParameters" : [ 23 ]
            },
            "verify": {
                "rows.SIZE": 2,
                "rows": [
                    { "id": 1002, "name": "Devaldo", "age": 24 },
                    { "id": 1003, "name": "Trevaldo", "age": 35 }
                ]
            }
        }
    ]
}

authorjapps · 2024-09-28T19:17:15Z

Hi @authorjapps @nirmalchandra, I find very useful the ability to interact with the database. I will like to contribute to this issue.

I have reviewed the repository and docs to get understanding of how zerocode works as it is my first contribution. I noted that the Sample-DB-SQL-Executor depends on Spring and specific DBMSs, and other classes like org.jsmart.zerocode.zerocodejavaexec.DbSqlExecutor.java do not interact with a database. I think that a DBMS vendor independent approach would be worth.

This is a proposal for discussion before starting the implementation:

Use Apache Commons DbUtils that is a very stable and lightweight library.
This will allow to implement something that does not depend on any particular DBMS.

Add an executor (e.g. DbSqlExecutor) whith the loadCsv operation that implements this issue,
as well as executeQuery and executeUpdate to handle SQL queries with parameters

Csv reading will be based/inspired on the domain Parametrized.java class.

Tests would use H2, as it is already included as test scoped dependency. Postgress requires a container, this should be tested manually, or added to the pipeline if required.

Where to place the code? maybe in a org.jsmart.zerocode.core.db namespace?

Below are some examples
{
    "scenarioName": "Sample interaction with the db executor ( data from AC2)",
    "steps": [
        {
            "name": "Loading a csv file",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "loadCsv",
            "request": {
                "tableName": "players",
                "csvSource": "path/db_sample.csv",
                "noHeaders" : false, //<-- this is the default value, guess that using header may be the most used
                "nullString" : "" //<-- used to insert null values, this is the default value
            },
            "verify": { } //<-- no data returned, log and fail the step if anything fails
       },
        {
            "name": "Read the loaded data using SQL",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "executeQuery",
            "request": {
                "sqlStatement": "SELECT id, name, age FROM players WHERE age > ? ORDER BY age ASC",
                "sqlParameters" : [ 23 ]
            },
            "verify": {
                "rows.SIZE": 2,
                "rows": [
                    { "id": 1002, "name": "Devaldo", "age": 24 },
                    { "id": 1003, "name": "Trevaldo", "age": 35 }
                ]
            }
        }
    ]
}

@javiertuya 👋 , Thanks for the suggestions! Great thought!

All of it has great details and sounds like great Idea.
Let me capture some ACs from your proposal and put it to the main ticket description, which would be consistent with the codebase and framework.

nirmalchandra · 2024-09-28T19:25:57Z

Also, please take into consideration in this ticket(if possible or in a new ticket later), how to LOAD/IMPORT data or execute code from a private URL(unlike GitHub raw.github... which is not a public URL). That will be of great help to fit more usecaes. Hope this makes sense!

(I think better to separete out this feature into a new ticket and impl in a separte PR)

authorjapps · 2024-10-01T11:21:33Z

AC1:

Given a datafile in CSV format, matching the Database table columns, provide mechanism to load these data(rows and columns) in the that table.

Example:

Table "players" has three columns as below:
ID:
NAME:
AGE:
and data in CSV is as below:
1001, Ronaldo, 23
1002, Devaldo, 24
1003, Trevaldo, 35
Note- There are no headers present in the above CSV

AC2:

Same as AC1, but with headers.

Example:

and data in CSV is as below:
ID, AGE, NAME <----------- 1st row as header
1001, 23, Ronaldo
1002, 24, Devaldo
1003, 35, Trevaldo
AC3 :

Required for Postgres DB (v9 to v12 should be supported )

AC4: (optional)

File could be present in the local files system(eg /test/resources) or in a git repo. Or implement this via a separate ticket.

AC5 :
{
    "scenarioName": "Sample interaction with the db executor ( data from AC2)",
    "steps": [
        {
            "name": "Loading a csv file",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "LOADCSV", //<--- in uppercase (for consistency), but support both cases
            "request": {
                "tableName": "players",
                "csvSource": "path/db_sample.csv",
                "withHeaders" : false, //<--false:  this is the default value most cases. (optional)
                "nullString" : "" //<-- used to insert null values, this is the default value (optional)
            },
            "verify": { 
                "size" : 36 //<-- 36 records: number of records loaded. (optional field as usual)
             } 
       },
        {
            "name": "assert_loaded_data_via_select",
            "url": "org.jsmart.zerocode.core.db.DbSqlExecutor",
            "operation": "EXECUTE", //<---- executes an in-line SQL (in this case a SELECT sql)
            "request": {
                "sql": "SELECT id, name, age FROM players WHERE age > ? ORDER BY age ASC",
                "sqlParams" : [ 23 ]
            },
            "verify": {
                "SIZE": 2, //<--- Number of rows in the output
                "rows": [
                    { "id": 1002, "name": "Devaldo", "age": 24 },
                    { "id": 1003, "name": "Trevaldo", "age": 35 }
                ]
            }
        }
    ]
}
AC6:

From @javiertuya : This is a proposal for discussion before starting the implementation:

Where to place the code? Ans: org.jsmart.zerocode.core.db package

Use Apache Commons DbUtils that is a very stable and lightweight library.
This will allow to implement something that does not depend on any particular DBMS.

Add an executor (e.g. DbSqlExecutor) with the operation explained in the example
as well as EXECUTE (or SELECT and UPDATE separately) to handle SQL queries with parameters

Tests would use H2, as it is already included as test scoped dependency.

Postgress requires a container(an example of Docker already present), this should be tested manually, or added to the CI pipeline if required.

More >>

CSV reading will be based/inspired on the domain Parametrized.java class. : What does this mean? Please explain this or point to the PR when implemented.

On the AC6 or AC5, please discuss with @javiertuya if you have any questions or suggestions.

@javiertuya , I have captured almost all of it now into the ticket. Please proceed with implementation when ready.
Assigned this ticket to you.

(mainly AC5, AC6 with minor updates)

javiertuya · 2024-10-01T16:36:39Z

@authorjapps Great, working on it.

More >>

CSV reading will be based/inspired on the domain Parametrized.java class. : What does this mean? Please explain this or point to the PR when implemented.

I was referring to using the same approach to get the content of the csvSource. This could result in some code to refactor later.

authorjapps · 2024-10-01T20:57:25Z

@authorjapps Great, working on it.

More >>

CSV reading will be based/inspired on the domain Parametrized.java class. : What does this mean? Please explain this or point to the PR when implemented.

I was referring to using the same approach to get the content of the csvSource. This could result in some code to refactor later.

Oh, ok.
I think it's better to render the output "rows[]" as simple JSON like below
instead of { "id": 1002, "name": "Devaldo", "age": 24 } in the same line.

This will make the assertions mechanism very easy and natural (also no additional overhead to implement this too imo )

Also, for this you non't need to implement Parametrized etc.

"rows": [
                    {
                        "id": 1002,
                        "name": "Devaldo",
                        "age": 24
                    },
                    {
                        "id": 1003,
                        "name": "Trevaldo",
                        "age": 35
                    }
]

javiertuya · 2024-10-02T16:01:38Z

@authorjapps

"operation": "LOADCSV", //<--- in uppercase (for consistency), but support both cases

I found that the JavaMethodExecutor is case sensitive when matching the operation methods. It seems that it is a single line change. Should I submit an issue+PR, just a new PR, or include in this PR (in a separate commit)?

authorjapps · 2024-10-03T10:48:09Z

@authorjapps

"operation": "LOADCSV", //<--- in uppercase (for consistency), but support both cases

I found that the JavaMethodExecutor is case sensitive when matching the operation methods. It seems that it is a single line change. Should I submit an issue+PR, just a new PR, or include in this PR (in a separate commit)?

You don't have to change in the JavaMethodExecutor.

You can create something like below, a passthrough mechanism in the DBExecutor which you're going to impl.
Both, with public visibility should satisfy the AC.

LOADCSV(...){
   loadcsv(...);
}

loadcsv(...){
   ...
}

authorjapps · 2024-10-19T15:27:27Z

The documentation PR is here :
authorjapps/zerocode-tdd-docs#24

authorjapps added the feature-request New feature which can be introduced label Sep 14, 2024

authorjapps assigned javiertuya Oct 1, 2024

javiertuya linked a pull request Oct 8, 2024 that will close this issue

ISSUE-680 # Add the DB SQL Executor to import data from CSV and execute SQL statements #686

Open

17 tasks

authorjapps linked a pull request Oct 19, 2024 that will close this issue

Add Doc for the DB SQL Executor authorjapps/zerocode-tdd-docs#24

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DB] Import or load data from an external CSV file to a table #680

[DB] Import or load data from an external CSV file to a table #680

authorjapps commented Sep 14, 2024 •

edited by nirmalchandra

Loading

nirmalchandra commented Sep 15, 2024 •

edited

Loading

authorjapps commented Sep 17, 2024

javiertuya commented Sep 27, 2024

authorjapps commented Sep 28, 2024

nirmalchandra commented Sep 28, 2024

authorjapps commented Oct 1, 2024

AC1:

AC2:

AC3 :

AC4: (optional)

AC5 :

AC6:

javiertuya commented Oct 1, 2024

authorjapps commented Oct 1, 2024 •

edited

Loading

javiertuya commented Oct 2, 2024

authorjapps commented Oct 3, 2024 •

edited

Loading

authorjapps commented Oct 19, 2024

[DB] Import or load data from an external CSV file to a table #680

[DB] Import or load data from an external CSV file to a table #680

Comments

authorjapps commented Sep 14, 2024 • edited by nirmalchandra Loading

AC1:

AC2:

AC3 :

AC4:

AC5 :

AC6:

ACRONYMS:

nirmalchandra commented Sep 15, 2024 • edited Loading

Suggestion: (WITHOUT HEADERS)

authorjapps commented Sep 17, 2024

javiertuya commented Sep 27, 2024

authorjapps commented Sep 28, 2024

nirmalchandra commented Sep 28, 2024

authorjapps commented Oct 1, 2024

AC1:

AC2:

AC3 :

AC4: (optional)

AC5 :

AC6:

javiertuya commented Oct 1, 2024

authorjapps commented Oct 1, 2024 • edited Loading

javiertuya commented Oct 2, 2024

authorjapps commented Oct 3, 2024 • edited Loading

authorjapps commented Oct 19, 2024

authorjapps commented Sep 14, 2024 •

edited by nirmalchandra

Loading

nirmalchandra commented Sep 15, 2024 •

edited

Loading

authorjapps commented Oct 1, 2024 •

edited

Loading

authorjapps commented Oct 3, 2024 •

edited

Loading