Skip to content
This repository has been archived by the owner on Jun 7, 2021. It is now read-only.

TRAFODION-2957 one stmt support multi-queries #1535

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mashengchen
Copy link
Contributor

  1. reconstitute the constructor of TrafT4Statement & TrafT4PreparedStatement
  2. add new property allowMultiQueries to support exec multi-queries in one statement
  3. new class extends from TrafT4Statement & TrafT4PreparedStatement to exec multi-queries

@Traf-Jenkins
Copy link

Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2582/

@Traf-Jenkins
Copy link

@moscowgentalman
Copy link
Contributor

I think you should add license header into new java files.

@zellerh
Copy link
Contributor

zellerh commented Apr 24, 2018

Can you explain a bit how this will work at the JDBC interface level? Will a user pass a statement like this to Connection::prepareStatement:

conn.prepareStatement("begin insert into t1 values(?,?); delete from t2; insert into t2 values (?); end");

If/when we resurrect the Trafodion code to support compound statements, will the interface work the same?

null);
}

sqlArr = sql.split(";");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there is ';' in the character literal of the sql query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there will have syntax error.
eg: insert into tbl values ('a;a'); insert into tbl2 values ('b')
there will have 3 preparedStatments, each one do itself prepare, then there will throw exception when do insert into tbl values ('a
Actually this is a problem, i don't have a good solution now, maybe you can give some suggestions, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could search for the standalone ';' and ignore ';' when it is within a single quote.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very prone to opening application to sql code insertion attacks. We have to be VERY careful not to allow any literal string to be executed as sql code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per specification we should be allowing literals as part of the SQL. I guess Trafodion SQL engine shouldn't be allowing the variables like $PARAM1 to be substituted on the fly in the query. If Trafodion did, it should be validated sufficiently not to allow the manipulation of the query via these variables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svarnau @selvaganesang actually, what TrafT4MultiQueriesPreparedStatement do is just TrafT4PreparedStatement do, so if TrafT4PreparedStatement can avoid insertion attacks or variables like $PARAM1, then TrafT4MultiQueriesPreparedStatement can

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you @mashengchen. But my suggestion is to take care of ';' when it appears in the midst of string literal in the param. Do you think if that suggestion would hold good? If so, do you think it is better to make that change

@selvaganesang
Copy link
Contributor

I am not clear on the application interface to support this feature. Are we trying to conform to mysql way as given in this link https://stackoverflow.com/questions/10797794/multiple-queries-executed-in-java-in-single-statement

@mashengchen
Copy link
Contributor Author

@selvaganesang yes, the same feature

@mashengchen
Copy link
Contributor Author

mashengchen commented Apr 24, 2018

@zellerh
seems you lose a ";" after begin, if no ";" there will throw syntax error.
so if query like "begin; insert into t1 values(?,?); delete from t2; insert into t2 values (?); end", actually there will have 5 preparedStatements , each do one query.

@Traf-Jenkins
Copy link

New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2587/

@zellerh
Copy link
Contributor

zellerh commented Apr 24, 2018

@mashengchen, I was thinking of a compound statement in my example. Your reply and the link that @selvaganesang provided helped me understand better. Trying to summarize what I heard, what you are implementing is a way to run multiple statements, in the following way:

  • Users need to specify an option in the connection string to enable this feature.
  • Users need to specify multiple statements, separated by semicolons (no begin/end block).
  • This feature does not implement a compound statement (begin end block).

If that's true, then that sounds ok to me. I just wanted to make sure we can add compound statements like in MySQL or IBM DB2 later without running into trouble with your feature.

@Traf-Jenkins
Copy link

@mashengchen
Copy link
Contributor Author

@zellerh it's right in your summary, this PR is just a simple implement of the feature, the fully implement should change the sql engine code, which i am not familiar with it.

}
}

public boolean execute(String sql) throws SQLException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is invalid in PreparedStatement. A preparedStatement can't execute different SQL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit of preparedstatements is to prepare the query once and to execute multiple times. By breaking each semicolon separated sql into its own preparestatement, are we losing out on this feature?

Would it be more appropriate to implement this feature for dynamic statements "Statement" as opposed to "PreparedStatement" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@selvaganesang
i am not sure whether user will do execute(sql) in a preparedStmt. but this api is valid in JKD, it's inherted from java.sql.statement, if it's invalid in trafodion, i can remove it.

@venkat1m there may have possible that user want to exec "insert into t values (?); insert into t2 values(?,?)" in one preparedstatement to avoid sql code insertion attacks, in this PR, it will separate to 2 preparedStmts, and each one can execute multi-times. so the prepare is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be valid from interface inheritance point of view. However, the JDBC specification clearly says the following

boolean execute(String sql)
throws SQLException
Note:This method cannot be called on a PreparedStatement or CallableStatement.

It is not a Trafodion restriction. It is against the intent of PreparedStatement functionality as per the specification,

All other flavors of execute call in Statement object can't be called in PreparedStatement or CallableStatement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@selvaganesang so your suggestion is to remove the method or throw exception when user invoke the method?
i see TrafT4PreparedStatement not mark the methods( execute(sql) ) as not support

@selvaganesang
Copy link
Contributor

You should ensure that all the methods of the standard PreparedStatement and Statement work seamlessly with the multi-queries statements. If a particular method doesn't make sense or can't work correctly, then it is better to return an error. For eg. addBatch, executeBatch etc.

In case of select statement, I couldn't figure out how result sets from different select statements are interfacing with the application.

The transaction boundaries needs to be explained too for both autocommit and non-autocommit mode.

connection_.props_.t4Logger_.logp(Level.FINE, "TrafT4MultiQueriesPreparedStatement", "close", "", p);
}
for (int i = 0; i < pstmtArr.length; i++) {
pstmtArr[i].close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to add a try catch around here and close as many statements as possible and throw the first exception, so as to free resources. For example I have 5 statements and 2nd one fails, the remaining 3 also will not get closed.

Copy link
Contributor Author

@mashengchen mashengchen May 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very well suggestions, thanks

connection_.props_.t4Logger_.logp(Level.FINE, "TrafT4MultiQueriesPreparedStatement", "cancel", "", p);
}
for (int i = 0; i < pstmtArr.length; i++) {
pstmtArr[i].cancel();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to add a try catch around here and cancel as many statements as possible and throw the first exception. For example I have 5 statements and cancel of 2nd stmt fails, the remaining 3 also will not get cancelled.

@Traf-Jenkins
Copy link

New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2610/

@Traf-Jenkins
Copy link

@Traf-Jenkins
Copy link

Previous Test Aborted. New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2611/

@Traf-Jenkins
Copy link

@selvaganesang
Copy link
Contributor

It looks like the weakReference of Statement object is removed. The weakReference is used to implicitly close the statement to clean up the resources associated with the Statement object if the application doesn't explicitly close the statement. This is to take care of resource leaks in the SQL side. Are there any other alternatives added to take care of such resource leaks. If not, I would suggest this PR may not be merged.

@mashengchen
Copy link
Contributor Author

@selvaganesang the weakReference still exist, I just refactor the construction of TrafT4Statement & TrafT4PreparedStatement, each time invoke TrafT4MultiQueriesStatement or TrafT4MultiQueriesPreparedStatement or TrafT4PreparedStatement, there will invoke the construction of TrafT4Statement, in which there will make the statement weakReference.
see line 1415 of TrafT4Statement, it's just fold in the page.

@selvaganesang
Copy link
Contributor

Thanks @mashengchen. Now I get it. You have refactored to call another constructor within a constructor. I remember there was one constructor of T4Statement (that is called internally) where the weakReference shouldn't be constructed. I am hoping that remains the same way even now.

@mashengchen
Copy link
Contributor Author

@selvaganesang i think what you descript is TrafT4Statement(); i did not change the behavior that.
also i check all constructor, and there is no add behavior and no lose behavior.

@Traf-Jenkins
Copy link

New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2638/

@Traf-Jenkins
Copy link

@Traf-Jenkins
Copy link

New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2643/

@Traf-Jenkins
Copy link

@DaveBirdsall
Copy link
Contributor

@selvaganesang and @zellerh, does this change look ready to merge now?

@zellerh
Copy link
Contributor

zellerh commented May 15, 2018

All the comments I had are answered, so don't have any objections.

@selvaganesang
Copy link
Contributor

I am ok with merging this PR but the following needs to be addressed soon.

  1. str.split(';') could split at the wrong place if there is ';' in string literal in the given sql string. This could result in unintended syntax error.
  2. Transaction semantics needs to be clarified.

I had raised these concerns earlier too

@svarnau
Copy link
Contributor

svarnau commented May 15, 2018

I'm concerned that the issue Selva points out (incorrect parsing of semicolons) is not just a potential syntax error, but a potential security vulnerability -- via SQL injection into string values. If so, then "fix it later" is not good enough. It needs to be fixed before this is merged.

@DaveBirdsall
Copy link
Contributor

Thanks @svarnau for pointing that out. I didn't sufficiently pick that up out of @selvaganesang's comments. I agree: the parsing issue needs to be fixed to avoid SQL injection vulnerabilities.

@zellerh
Copy link
Contributor

zellerh commented May 15, 2018

Thanks, very good point! That always makes me think of this cartoon: https://xkcd.com/327/.

@mashengchen
Copy link
Contributor Author

@selvaganesang about your concerns

  1. it's a question
  2. i think transaction is no need, as i know each query in query1;query2;query3;query4 are not tranactional related. if there needed , user can use conn.setAutoCommit(false); then use conn.commit();

@mashengchen
Copy link
Contributor Author

@svarnau can you give an example for SQL injection. actually, after split(";"), each query will invode statement or preparedstatement by user define, if use statement there will meet SQL injection, but there certainly will meet, to avoid this ,we should use preparedStatement.
actually user can only know he/she use preparedStatement by using conn.prepareStatement(sql); in our code it will create multi prepareStatement for each query, so i can't image how to meet a sql inject

@mashengchen
Copy link
Contributor Author

@svarnau
after open allowMultiQueries, coder can use conn.preparedStatement("select * from A where a=?;select * from B where b=?"); to do prepare in one statement.
For each query there acutally create a separated pstmt, that means there will still use preparedStatement, so if user do sql inject, and set 'A;drop table A' , our sql engine will find out this inject i think , and throw exception.
for conn.createStatement("select * from A where a=" +"A;drop table A" );. I think even close allowMultiQueries there will have problem

@selvaganesang
Copy link
Contributor

Thanks @svarnau for bringing this up.. Yes, I agree it is a potential security vulnerabilty

@svarnau
Copy link
Contributor

svarnau commented May 16, 2018

@mashengchen - Not sure I can craft a convincing exploit, just saying the potential is there. The statements need to be parsed more carefully to be on the safe side.

@selvaganesang
Copy link
Contributor

selvaganesang commented May 17, 2018

Try this query select * from "A\"; drop table A;"
or
select * from a "A\"; drop table a ;"

@mashengchen
Copy link
Contributor Author

@selvaganesang yes, but the user can't run this query in the app, only the programmer can code the query.
and if it's the inject in application, it should be a param to send to a query like select * from user where username = ?; select * from auth where username=?
then there will do preparedstatement be split ';' and use setXXX to relpace '?', in this time there will be like setXXX(1, 'select * from "A"; drop table A;"') in your case, and of course the engine will throw exception.
this change is just separate each query (programmers define , not app user define), and do each query what they should do means if parparedstatement then preparedstatement for each query, if statement then statement for each query.

@selvaganesang
Copy link
Contributor

Yes. It is true that programmer needs to code this query. Consider an application where user can fill up either the full query or parts of the query and the application just passes the query thus formed to Trafodion SQL engine . The proper processing of the ';' like splitting into multiple statements only when the ';' is outside of a single or double quotes would help to restrict such kind of injection attacks. It is not clear to me what is the resistance to process the split character ';' properly here?

@zellerh
Copy link
Contributor

zellerh commented May 21, 2018

In addition to a possible SQL injection attack, this feature also just won't work if there is a semicolon in a quoted string. Agree with @selvaganesang that we should fix this and split the statements correctly even with semicolons in single or double-quoted strings. Example: select a, b, "tl;dr" from t where b = 'a;b;c'; values(10);

@Traf-Jenkins
Copy link

Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2886/

@Traf-Jenkins
Copy link

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants