Internal SQL Library

Introduction

The Internal SQL library is a tool to query the content of Java arrays using the SQL language.

You call a java method, named ISQLQuery.query, to which you pass some arrays and an SQL query.
The arrays get grouped in virtual database tables, which are queried with the SQL query.
The result set is converted to arrays, returned by ISQLQuery.query.

The following picture shows how this is done:

Internal SQL Schema
In the figure: When we call the query function:
  1. The SQL query is parsed, building an internal structure. If it contain errors the call is stopped and the errors are returned as exceptions.
  2. The input arrays are applied to the query internal structure.
  3. The output arrays are filled with the return values of the query.

Dependencies

The Internal SQL library uses ANTLR v 3.x to parse the SQL query with the grammar defined in ISQLSelect.g.

How to use it in a program

After linked the library to your project, you need to create an instance of the class ISQLQuery.
The ISQLQuery constructor has two parameters:
Once you created the ISQLQuery object, you call its query method to apply the SQL query to the arrays.
The query method has the following parameters:
The arrays elements are objects (Double, String, Boolean, Timestamp). The reason why we did not use primitive arrays when possible (double[] and boolean[]) is practical: handle the arrays in a generic way, avoiding code duplication.
If your arrays are based on primitives (double[] or boolean[]) you can use ArrayBuilder to convert them to the equivalent object arrays (Double[] or Boolean[]).

The query method returns a map name -> result value.
Again, the name is the same that is used in the query as alias of the result item (after the as word).
The result value is an instance of ITypedValue. It means that it can be an array or a constant. It is a constant if resulting from an expression containing only a constant or an operation of constants.

Limitations

The SQL dialect to write the queries is not standard. These are the differences:
We have these limitations because: It is not excluded that in future versions some of this limitations will be removed.

Query Syntax

For who wants to understand precisely the SQL used for these queries, there is a grammar file, called ISQLSelect.g. Here I give a more human readable explanation. The query is in the following form:

SELECT <expression1> as <alias1>, ... <expressionN> as <aliasN>

FROM [<column1_1>, ... <columnN_1>] as <table1>, ... [<column1_M>, .. <columnP_M>] as <tableM>

WHERE <condition1> [AND|OR <condition2> ... AND|OR <conditionN>]

ORDER BY <column1>, ... <columnN>

GROUP BY <column1>, ... <columnN>

and composed by the following parts:

SELECT clause

The SELECT clause contains the result values of the query. There must be an alias (AS) for each value field.
Each result value can be:

FROM clause

The FROM clause contains the definitions of the tables. Each definition is in the form [<column1>, <column2..>] AS <table>, like in the query in the figure. All columns in a table need to be of the same size. If they have different size, you can distribute them in two or more tables and join them using the WHERE clause.

WHERE clause (optional)

The WHERE clause is, as in the standard SQL, a set of predicates connected by AND and OR. Each predicate is in the form: An expressions in the WHERE clause can be:

ORDER BY clause (optional)

As in the standard SQL, the ORDER BY clause consists of the columns by which the query result has to be sorted. It is in the form: ORDER BY <column1>, <column2>, ... <columnN>

GROUP BY clause (optional)

Also the GROUP BY clause is the same as in the standard SQL. When you have some aggregate function in the SELECT clause (SUM, AVERAGE...) all the columns in the SELECT clause that are not parameters of the average functions must be mentioned in the GROUP BY clause. The GROUP BY clause is in the form: GROUP BY <column1>, <column2>, ... <columnN>

Some additional information.

Constants can be of 4 types, like the arrays.