-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to instantiate a Morel context from an existing Calcite rootSchema
?
#145
Comments
Yes, foreignValue is the way to do it. It associates a Calcite schema with a Morel variable. You could have a single variable named ‘calcite’ for the root schema (and all first-level schemas would appear as fields) or create a separate foreignValue for each first-level schema. You will need some kind of shell. (Parses and executes commands, and prints the results, in the environment created by previous commands in the same session.) Maybe one of the existing shells in Morel will work for you, or could be adapted. The existing shells are able to read commands from a file, including recursively reading files, then switch back to their original input stream, which might be useful to you. I briefly looked at creating a Java wrapper so that Morel could run in Jupyter, and it looked like a pretty similar problem. |
I’m not sure that Calcite’s constructor being protected is a problem. Especially if it is constructed by a shell. One improvement might be for the list of foreignValue to be dynamic, so you can add and remove after the shell has been created. |
Thanks for the pointers Julian. I think I've gotten closer with this, but getting an error about a table not being found. My guess would be that I maybe need to pass in a custom The schema/table structure of my
object CalciteSchemaService {
init {
// ...
addDatabase("chinook", h2DataSource)
}
fun addDatabase(databaseName: String, ds: DataSource): SchemaPlus {
// Inner helper func
fun getSubSchemas(connection: Connection): List<String> {
connection.metaData.schemas.use { rs ->
val schemas = mutableListOf()
while (rs.next()) {
schemas.add(rs.getString("TABLE_SCHEM"))
}
return schemas
}
}
if (getSubSchemas(ds.connection).isNotEmpty()) {
val catalogSchema = JdbcCatalogSchema.create(rootSchema, databaseName, ds, null)
rootSchema.add(databaseName, catalogSchema)
} else {
val jdbcSchema = JdbcSchema.create(rootSchema, databaseName, ds, null, null)
rootSchema.add(databaseName, jdbcSchema)
}
return rootSchema
} class CalciteMorelInterpreter {
static String smlProgram = "chinook;";
static InputStream input = new ByteArrayInputStream(smlProgram.getBytes());
static PrintStream output = System.out;
static SchemaPlus rootSchema = CalciteSchemaService.INSTANCE.getRootSchema();
static SchemaPlus chinookSchema = rootSchema.getSubSchema("chinook").getSubSchema("public");
static Map<String, ForeignValue> foreignValueMap = Calcite
.withDataSets(
Map.of("chinook", (Calcite calcite) -> new CalciteForeignValue(calcite, chinookSchema, true))
)
.foreignValues();
public static void main(String[] args) {
var morelMain = new net.hydromatic.morel.Main(List.of(), input, output, foreignValueMap, new File(""));
morelMain.run();
}
} Exception in thread "main" org.apache.calcite.runtime.CalciteException: Table 'chinook.public.Album' not found
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:599)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505)
at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:599)
at org.apache.calcite.tools.RelBuilder.scan(RelBuilder.java:1575)
at org.apache.calcite.tools.RelBuilder.scan(RelBuilder.java:1575)
at net.hydromatic.morel.foreign.CalciteForeignValue.lambda$value$4(CalciteForeignValue.java:91)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
at com.google.common.collect.RegularImmutableSortedSet.forEach(RegularImmutableSortedSet.java:89)
at net.hydromatic.morel.foreign.CalciteForeignValue.value(CalciteForeignValue.java:89)
at net.hydromatic.morel.compile.Environments.lambda$foreignBindings$2(Environments.java:104)
at com.google.common.collect.SingletonImmutableBiMap.forEach(SingletonImmutableBiMap.java:68)
at net.hydromatic.morel.foreign.CalciteForeignValue.lambda$value$4(CalciteForeignValue.java:91)
at net.hydromatic.morel.compile.Environments.foreignBindings(Environments.java:101)
at net.hydromatic.morel.compile.Environments.env(Environments.java:95)
at net.hydromatic.morel.compile.Environments.env(Environments.java:63)
at net.hydromatic.morel.Main.run(Main.java:122)
at com.github.gavinraydev.CalciteMorelInterpreter.main(Morel.java:33)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
at com.google.common.collect.RegularImmutableSortedSet.forEach(RegularImmutableSortedSet.java:89)
at net.hydromatic.morel.foreign.CalciteForeignValue.value(CalciteForeignValue.java:89)
at net.hydromatic.morel.compile.Environments.lambda$foreignBindings$2(Environments.java:104)
at com.google.common.collect.SingletonImmutableBiMap.forEach(SingletonImmutableBiMap.java:68)
at net.hydromatic.morel.compile.Environments.foreignBindings(Environments.java:101)
at net.hydromatic.morel.compile.Environments.env(Environments.java:95)
at net.hydromatic.morel.compile.Environments.env(Environments.java:63)
at net.hydromatic.morel.Main.run(Main.java:122)
at com.github.gavinraydev.CalciteMorelInterpreter.main(Morel.java:33) |
Ah okay, I figured it out. The logic for // Recursively copy sub-schemas from a rootSchema into a Morel Calcite ForeignValue context
private static Calcite createMorelCalciteForeignValueCtx(SchemaPlus rootSchema) {
boolean toLowerCase = true;
final ImmutableMap.Builder<String, DataSet> builder = ImmutableMap.builder();
SchemaPlus currentSchema = rootSchema;
while (true) {
final Set<String> subSchemaNames = currentSchema.getSubSchemaNames();
if (subSchemaNames.isEmpty()) {
break;
}
for (String subSchemaName : subSchemaNames) {
final SchemaPlus subSchema = currentSchema.getSubSchema(subSchemaName);
String name = Schemas.path(subSchema).names().stream().reduce((a, b) -> a + "__" + b).get();
builder.put(name, calcite -> {
SchemaPlus newSchema = calcite.rootSchema.add(name, subSchema);
return new CalciteForeignValue(calcite, newSchema, toLowerCase);
});
}
currentSchema = currentSchema.getSubSchema(subSchemaNames.iterator().next());
}
return Calcite.withDataSets(builder.build());
} static String smlProgram = "from a in chinook__public.artist yield {a.artistid, a.name};";
static Reader input = new InputStreamReader(new ByteArrayInputStream(smlProgram.getBytes()));
static Writer output = new StringBuilderWriter();
public static void main(String[] args) {
Calcite calciteMorelCtx = createMorelCalciteForeignValueCtx(ROOT_SCHEMA);
var morelMain = new Main(List.of(), input, output, calciteMorelCtx.foreignValues(), new File(""));
morelMain.run();
} Result:
val it =
[{artistid=1,name="AC/DC"},{artistid=2,name="Accept"},
{artistid=3,name="Aerosmith"},{artistid=4,name="Alanis Morissette"},
{artistid=5,name="Alice In Chains"},{artistid=6,name="Ant?nio Carlos Jobim"},
{artistid=7,name="Apocalyptica"},{artistid=8,name="Audioslave"},
{artistid=9,name="BackBeat"},{artistid=10,name="Billy Cobham"},
{artistid=11,name="Black Label Society"},{artistid=12,name="Black Sabbath"},
...] : {artistid:int, name:string} list |
Is there a way to get this enhanced functionality into trunk? By which I mean, reframe it as a new feature, then submit a PR that implements the feature, including tests? I certainly think that Regarding the name mapping, to |
Sure, certainly not opposed to doing that.
I think I might try this route, extending the code that already exists May see if there's a way to write a |
Designing for reuse is hard. So I'd probably copy-paste the existing shell, customize it until I had something that works for non-REPL uses, then see whether there's any way to get rid of the copied code (by sub-classing, adding handlers, etc.) Or just start from scratch. If the code seems to be trending towards a shell, let it go in that direction. :) Maybe take inspiration from https://github.com/SpencerPark/IJava (a Java kernel for Jupyter). |
@julianhyde I have created an initial PR here, but it's throwing an error about java.lang.NullPointerException: metadataHandlerProvider
at java.base/java.util.Objects.requireNonNull(Objects.java:233)
at org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
at org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
at org.apache.calcite.rel.metadata.RelMetadataQuery.collations(RelMetadataQuery.java:604)
at org.apache.calcite.rel.metadata.RelMdCollation.project(RelMdCollation.java:291)
at org.apache.calcite.rel.logical.LogicalProject.lambda$create$0(LogicalProject.java:125)
at org.apache.calcite.plan.RelTraitSet.replaceIfs(RelTraitSet.java:244)
at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:124)
at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:114)
at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:178)
at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:2025)
at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1797)
at net.hydromatic.morel.foreign.CalciteForeignValue.lambda$value$4(CalciteForeignValue.java:100)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
at com.google.common.collect.RegularImmutableSortedSet.forEach(RegularImmutableSortedSet.java:89)
at net.hydromatic.morel.foreign.CalciteForeignValue.value(CalciteForeignValue.java:89)
at net.hydromatic.morel.compile.Environments.lambda$foreignBindings$2(Environments.java:104)
at com.google.common.collect.RegularImmutableMap.forEach(RegularImmutableMap.java:292)
at net.hydromatic.morel.compile.Environments.foreignBindings(Environments.java:101)
at net.hydromatic.morel.compile.Environments.env(Environments.java:95)
at net.hydromatic.morel.compile.Environments.env(Environments.java:63)
at net.hydromatic.morel.ProgrammaticShell.makeEnv(ProgrammaticShell.java:80)
at net.hydromatic.morel.ProgrammaticShell.<init>(ProgrammaticShell.java:54)
at net.hydromatic.morel.ProgrammaticShellTest.run(ProgrammaticShellTest.java:19) Would appreciate any feedback if you see what I'm doing wrong |
Ah wow wtf -- removing // BROKEN
static Map<String, ForeignValue> foreignValueMap = Calcite.withDataSets(BuiltInDataSet.DICTIONARY).foreignValues();
// WORKS
Map<String, ForeignValue> foreignValueMap = Calcite.withDataSets(BuiltInDataSet.DICTIONARY).foreignValues(); I guess I don't understand Java? This is very confusing 🤔 |
Yes, foreignValue is the way to do it. It associates a Calcite schema with a Morel variable. You could have a single variable named ‘calcite’ for the root schema (and all first-level schemas would appear as fields) or create a separate foreignValue for each first-level schema.
You will need some kind of shell. (Parses and executes commands, and prints the results, in the environment created by previous commands in the same session.) Maybe one of the existing shells in Morel will work for you, or could be adapted.
The existing shells are able to read commands from a file, including recursively reading files, then switch back to their original input stream, which might be useful to you.
I briefly looked at creating a Java wrapper so that Morel could run in Jupyter, and it looked like a pretty similar problem.
Julian
… On Apr 30, 2022, at 10:54 AM, Gavin Ray ***@***.***> wrote:
Heya Julian,
I'm working on a web playground for Calcite, that I plan on hosting for free so that people can experiment with it. It'll essentially be a web REPL with some pre-loaded data, and the ability to render visualizations of query plans + schema diagrams.
I figured that while I'm at it, it could be nice to add a second input form, which takes Morel expressions, and is bound to the same Calcite context/rootSchema.
Having a bit of difficulty figuring out how to do this though.
I have a singleton object that holds the master schema like below -- I think I need to pass this as a foreignValue to Morel or similar?
// Holds the master schema that contains all datasource schemas and their sub-schemas
object CalciteSchemaService {
val connection: CalciteConnection = initCalciteConnection()
val rootSchema: SchemaPlus = connection.rootSchema
val frameworkConfig: FrameworkConfig = Frameworks.newConfigBuilder()
.defaultSchema(connection.rootSchema)
.parserConfig(SqlParser.config().withCaseSensitive(false))
.build()
val relBuilder: RelBuilder = RelBuilder.create(frameworkConfig)
// <SNIPPED>
}
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
|
Heya Julian,
I'm working on a web playground for Calcite, that I plan on hosting for free so that people can experiment with it. It'll essentially be a web REPL with some pre-loaded data, and the ability to render visualizations of query plans + schema diagrams.
I figured that while I'm at it, it could be nice to add a second input form, which takes Morel expressions, and is bound to the same Calcite context/
rootSchema
.Having a bit of difficulty figuring out how to do this though.
I have a singleton object that holds the master schema like below -- I think I need to pass this as a
foreignValue
to Morel or similar?I see this but it's
protected
:morel/src/main/java/net/hydromatic/morel/foreign/Calcite.java
Lines 64 to 71 in 0b76fee
The text was updated successfully, but these errors were encountered: