diff --git a/docs/content/spark/auxiliary.md b/docs/content/spark/auxiliary.md index 5de0289565f2..a53adfcce100 100644 --- a/docs/content/spark/auxiliary.md +++ b/docs/content/spark/auxiliary.md @@ -61,18 +61,18 @@ SELECT * FROM default.T1 JOIN default.T2 ON xxxx; ``` ## Describe table -DESCRIBE TABLE statement returns the basic metadata information of a table. The metadata information includes column name, column type and column comment. +DESCRIBE TABLE statement returns the basic metadata information of a table or view. The metadata information includes column name, column type and column comment. ```sql --- describe table +-- describe table or view DESCRIBE TABLE my_table; --- describe table with additional metadata +-- describe table or view with additional metadata DESCRIBE TABLE EXTENDED my_table; ``` ## Show create table -SHOW CREATE TABLE returns the CREATE TABLE statement that was used to create a given table. +SHOW CREATE TABLE returns the CREATE TABLE statement or CREATE VIEW statement that was used to create a given table or view. ```sql SHOW CREATE TABLE my_table; @@ -107,6 +107,17 @@ SHOW TABLE EXTENDED IN db_name LIKE 'test*'; SHOW TABLE EXTENDED IN db_name LIKE 'table_name' PARTITION(pt = '2024'); ``` +## Show views +The SHOW VIEWS statement returns all the views for an optionally specified database. + +```sql +-- Lists all views +SHOW VIEWS; + +-- Lists all views that satisfy regular expressions +SHOW VIEWS LIKE 'test*'; +``` + ## Analyze table The ANALYZE TABLE statement collects statistics about the table, that are to be used by the query optimizer to find a better query execution plan. diff --git a/docs/content/spark/sql-ddl.md b/docs/content/spark/sql-ddl.md index 4a82d6b4a438..638a21a7042a 100644 --- a/docs/content/spark/sql-ddl.md +++ b/docs/content/spark/sql-ddl.md @@ -26,7 +26,9 @@ under the License. # SQL DDL -## Create Catalog +## Catalog + +### Create Catalog Paimon catalogs currently support three types of metastores: @@ -36,7 +38,7 @@ Paimon catalogs currently support three types of metastores: See [CatalogOptions]({{< ref "maintenance/configurations#catalogoptions" >}}) for detailed options when creating a catalog. -### Create Filesystem Catalog +#### Create Filesystem Catalog The following Spark SQL registers and uses a Paimon catalog named `my_catalog`. Metadata and table files are stored under `hdfs:///path/to/warehouse`. @@ -56,7 +58,7 @@ After `spark-sql` is started, you can switch to the `default` database of the `p USE paimon.default; ``` -### Creating Hive Catalog +#### Creating Hive Catalog By using Paimon Hive catalog, changes to the catalog will directly affect the corresponding Hive metastore. Tables created in such catalog can also be accessed directly from Hive. @@ -84,13 +86,13 @@ USE paimon.default; Also, you can create [SparkGenericCatalog]({{< ref "spark/quick-start" >}}). -#### Synchronizing Partitions into Hive Metastore +**Synchronizing Partitions into Hive Metastore** By default, Paimon does not synchronize newly created partitions into Hive metastore. Users will see an unpartitioned table in Hive. Partition push-down will be carried out by filter push-down instead. If you want to see a partitioned table in Hive and also synchronize newly created partitions into Hive metastore, please set the table property `metastore.partitioned-table` to true. Also see [CoreOptions]({{< ref "maintenance/configurations#coreoptions" >}}). 
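A minimal sketch of setting this property at table creation time (assuming an active SparkSession already bound to a Paimon Hive catalog; the table and column names below are placeholders, not part of this patch):

```scala
// Hedged example: a partitioned Paimon table whose newly created partitions are
// synchronized into the Hive metastore via `metastore.partitioned-table`.
// Table and column names are illustrative only.
spark.sql(
  """CREATE TABLE my_partitioned_table (
    |  id BIGINT,
    |  dt STRING
    |) USING paimon
    |PARTITIONED BY (dt)
    |TBLPROPERTIES ('metastore.partitioned-table' = 'true')""".stripMargin)
```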
-### Creating JDBC Catalog +#### Creating JDBC Catalog By using the Paimon JDBC catalog, changes to the catalog will be directly stored in relational databases such as SQLite, MySQL, postgres, etc. @@ -118,7 +120,9 @@ spark-sql ... \ USE paimon.default; ``` -## Create Table +## Table + +### Create Table After use Paimon catalog, you can create and drop tables. Tables created in Paimon Catalogs are managed by the catalog. When the table is dropped from catalog, its table files will also be deleted. @@ -152,7 +156,7 @@ CREATE TABLE my_table ( ); ``` -## Create Table As Select +### Create Table As Select Table can be created and populated by the results of a query, for example, we have a sql like this: `CREATE TABLE table_b AS SELECT id, name FORM table_a`, The resulting table `table_b` will be equivalent to create the table and insert the data with the following statement: @@ -210,8 +214,34 @@ CREATE TABLE my_table_all ( CREATE TABLE my_table_all_as PARTITIONED BY (dt) TBLPROPERTIES ('primary-key' = 'dt,hh') AS SELECT * FROM my_table_all; ``` -## Tag DDL -### Create or replace Tag +## View + +Views are based on the result set of a SQL query. When using `org.apache.paimon.spark.SparkCatalog`, views are managed by Paimon itself. +In this case, views are only supported when the `metastore` type is `hive`, and temporary views are not supported yet. + +### Create Or Replace View + +CREATE VIEW constructs a virtual table that has no physical data. + +```sql +-- create a view. +CREATE VIEW v1 AS SELECT * FROM t1; + +-- create a view; if a view of the same name already exists, it will be replaced. +CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1; +``` + +### Drop View + +DROP VIEW removes the metadata associated with a specified view from the catalog. + +```sql +-- drop a view +DROP VIEW v1; +``` + +## Tag +### Create or Replace Tag Create or replace a tag syntax with the following options. - Create a tag with or without the snapshot id and time retention. - Create an existed tag is not failed if using `IF NOT EXISTS` syntax.
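To put the View DDL documented above in context, here is a minimal, hedged sketch of running it end to end from a Spark session against a Hive-metastore-backed Paimon catalog. The warehouse path, metastore URI, and the `t1`/`v1` names are assumptions for illustration, not values mandated by this change:

```scala
import org.apache.spark.sql.SparkSession

// Register a Paimon catalog backed by a Hive metastore; all paths/URIs are placeholders.
val spark = SparkSession.builder()
  .master("local[*]")
  .config("spark.sql.catalog.paimon", "org.apache.paimon.spark.SparkCatalog")
  .config("spark.sql.catalog.paimon.warehouse", "hdfs:///path/to/warehouse")
  .config("spark.sql.catalog.paimon.metastore", "hive")
  .config("spark.sql.catalog.paimon.uri", "thrift://hive-metastore:9083")
  .getOrCreate()

spark.sql("USE paimon.default")
spark.sql("CREATE TABLE t1 (id INT, name STRING) USING paimon")

// View DDL from the section above: create, replace, list, and drop a view.
spark.sql("CREATE VIEW v1 AS SELECT * FROM t1")
spark.sql("CREATE OR REPLACE VIEW v1 AS SELECT id FROM t1")
spark.sql("SHOW VIEWS").show()   // should list v1
spark.sql("DROP VIEW v1")
```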
diff --git a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/catalyst/analysis/PaimonViewResolver.scala b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/catalyst/analysis/PaimonViewResolver.scala index a375a296583e..3da0ddab6417 100644 --- a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/catalyst/analysis/PaimonViewResolver.scala +++ b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/catalyst/analysis/PaimonViewResolver.scala @@ -24,13 +24,13 @@ import org.apache.paimon.spark.catalog.SupportView import org.apache.paimon.view.View import org.apache.spark.sql.SparkSession -import org.apache.spark.sql.catalyst.analysis.{GetColumnByOrdinal, UnresolvedRelation} -import org.apache.spark.sql.catalyst.expressions.{Alias, UpCast} +import org.apache.spark.sql.catalyst.analysis.{GetColumnByOrdinal, UnresolvedRelation, UnresolvedTableOrView} +import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, UpCast} import org.apache.spark.sql.catalyst.parser.ParseException import org.apache.spark.sql.catalyst.parser.extensions.{CurrentOrigin, Origin} -import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, SubqueryAlias} +import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, LogicalPlan, Project, SubqueryAlias} import org.apache.spark.sql.catalyst.rules.Rule -import org.apache.spark.sql.connector.catalog.PaimonLookupCatalog +import org.apache.spark.sql.connector.catalog.{Identifier, PaimonLookupCatalog} case class PaimonViewResolver(spark: SparkSession) extends Rule[LogicalPlan] @@ -47,6 +47,15 @@ case class PaimonViewResolver(spark: SparkSession) case _: ViewNotExistException => u } + + case u @ UnresolvedTableOrView(CatalogAndIdentifier(catalog: SupportView, ident), _, _) => + try { + catalog.loadView(ident) + ResolvedPaimonView(catalog, ident) + } catch { + case _: ViewNotExistException => + u + } } private def createViewRelation(nameParts: Seq[String], view: View): LogicalPlan = { @@ -83,3 +92,7 @@ case class PaimonViewResolver(spark: SparkSession) } } } + +case class ResolvedPaimonView(catalog: SupportView, identifier: Identifier) extends LeafNode { + override def output: Seq[Attribute] = Nil +} diff --git a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonStrategy.scala b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonStrategy.scala index 0c3865f7d979..fb7bc6b22cd3 100644 --- a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonStrategy.scala +++ b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonStrategy.scala @@ -20,13 +20,14 @@ package org.apache.paimon.spark.execution import org.apache.paimon.spark.{SparkCatalog, SparkUtils} import org.apache.paimon.spark.catalog.SupportView +import org.apache.paimon.spark.catalyst.analysis.ResolvedPaimonView import org.apache.paimon.spark.catalyst.plans.logical.{CreateOrReplaceTagCommand, CreatePaimonView, DeleteTagCommand, DropPaimonView, PaimonCallCommand, RenameTagCommand, ResolvedIdentifier, ShowPaimonViews, ShowTagsCommand} import org.apache.spark.sql.{SparkSession, Strategy} import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.analysis.ResolvedNamespace import org.apache.spark.sql.catalyst.expressions.{Expression, GenericInternalRow, PredicateHelper} -import org.apache.spark.sql.catalyst.plans.logical.{CreateTableAsSelect, LogicalPlan} +import 
org.apache.spark.sql.catalyst.plans.logical.{CreateTableAsSelect, DescribeRelation, LogicalPlan, ShowCreateTable} import org.apache.spark.sql.connector.catalog.{Identifier, PaimonLookupCatalog, TableCatalog} import org.apache.spark.sql.execution.SparkPlan import org.apache.spark.sql.execution.shim.PaimonCreateTableAsSelectStrategy @@ -100,6 +101,12 @@ case class PaimonStrategy(spark: SparkSession) if r.catalog.isInstanceOf[SupportView] => ShowPaimonViewsExec(output, r.catalog.asInstanceOf[SupportView], r.namespace, pattern) :: Nil + case ShowCreateTable(ResolvedPaimonView(viewCatalog, ident), _, output) => + ShowCreatePaimonViewExec(output, viewCatalog, ident) :: Nil + + case DescribeRelation(ResolvedPaimonView(viewCatalog, ident), _, isExtended, output) => + DescribePaimonViewExec(output, viewCatalog, ident, isExtended) :: Nil + case _ => Nil } diff --git a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonViewExec.scala b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonViewExec.scala index 7a4b907c72f1..2282f7c34411 100644 --- a/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonViewExec.scala +++ b/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/execution/PaimonViewExec.scala @@ -20,10 +20,12 @@ package org.apache.paimon.spark.execution import org.apache.paimon.spark.catalog.SupportView import org.apache.paimon.spark.leafnode.PaimonLeafV2CommandExec +import org.apache.paimon.view.View import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.catalog.CatalogTable import org.apache.spark.sql.catalyst.expressions.{Attribute, GenericInternalRow} -import org.apache.spark.sql.catalyst.util.StringUtils +import org.apache.spark.sql.catalyst.util.{escapeSingleQuotedString, quoteIfNeeded, StringUtils} import org.apache.spark.sql.connector.catalog.Identifier import org.apache.spark.sql.types.StructType import org.apache.spark.unsafe.types.UTF8String @@ -115,3 +117,116 @@ case class ShowPaimonViewsExec( s"ShowPaimonViewsExec: $namespace" } } + +case class ShowCreatePaimonViewExec(output: Seq[Attribute], catalog: SupportView, ident: Identifier) + extends PaimonLeafV2CommandExec { + + override protected def run(): Seq[InternalRow] = { + val view = catalog.loadView(ident) + + val builder = new StringBuilder + builder ++= s"CREATE VIEW ${view.fullName()} " + showDataColumns(view, builder) + showComment(view, builder) + showProperties(view, builder) + builder ++= s"AS\n${view.query}\n" + + Seq(new GenericInternalRow(values = Array(UTF8String.fromString(builder.toString)))) + } + + private def showDataColumns(view: View, builder: StringBuilder): Unit = { + if (view.rowType().getFields.size() > 0) { + val viewColumns = view.rowType().getFields.asScala.map { + f => + val comment = if (f.description() != null) s" COMMENT '${f.description()}'" else "" + // view columns shouldn't have data type info + s"${quoteIfNeeded(f.name)}$comment" + } + builder ++= concatByMultiLines(viewColumns) + } + } + + private def showComment(view: View, builder: StringBuilder): Unit = { + if (view.comment().isPresent) { + builder ++= s"COMMENT '${view.comment().get()}'\n" + } + } + + private def showProperties(view: View, builder: StringBuilder): Unit = { + if (!view.options().isEmpty) { + val props = view.options().asScala.toSeq.sortBy(_._1).map { + case (key, value) => + s"'${escapeSingleQuotedString(key)}' = '${escapeSingleQuotedString(value)}'" + } + builder 
++= s"TBLPROPERTIES ${concatByMultiLines(props)}" + } + } + + private def concatByMultiLines(iter: Iterable[String]): String = { + iter.mkString("(\n ", ",\n ", ")\n") + } + + override def simpleString(maxFields: Int): String = { + s"ShowCreatePaimonViewExec: $ident" + } +} + +case class DescribePaimonViewExec( + output: Seq[Attribute], + catalog: SupportView, + ident: Identifier, + isExtended: Boolean) + extends PaimonLeafV2CommandExec { + + override protected def run(): Seq[InternalRow] = { + val rows = new ArrayBuffer[InternalRow]() + val view = catalog.loadView(ident) + + describeColumns(view, rows) + if (isExtended) { + describeExtended(view, rows) + } + + rows.toSeq + } + + private def describeColumns(view: View, rows: ArrayBuffer[InternalRow]) = { + view + .rowType() + .getFields + .asScala + .map(f => rows += row(f.name(), f.`type`().toString, f.description())) + } + + private def describeExtended(view: View, rows: ArrayBuffer[InternalRow]) = { + rows += row("", "", "") + rows += row("# Detailed View Information", "", "") + rows += row("Name", view.fullName(), "") + rows += row("Comment", view.comment().orElse(""), "") + rows += row("View Text", view.query, "") + rows += row( + "View Query Output Columns", + view.rowType().getFieldNames.asScala.mkString("[", ", ", "]"), + "") + rows += row( + "View Properties", + view + .options() + .asScala + .toSeq + .sortBy(_._1) + .map { case (k, v) => s"$k=$v" } + .mkString("[", ", ", "]"), + "") + } + + private def row(s1: String, s2: String, s3: String): InternalRow = { + new GenericInternalRow( + values = + Array(UTF8String.fromString(s1), UTF8String.fromString(s2), UTF8String.fromString(s3))) + } + + override def simpleString(maxFields: Int): String = { + s"DescribePaimonViewExec: $ident" + } +} diff --git a/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonViewTestBase.scala b/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonViewTestBase.scala index 39ed8e8a769d..00f5566ed47a 100644 --- a/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonViewTestBase.scala +++ b/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonViewTestBase.scala @@ -93,4 +93,66 @@ abstract class PaimonViewTestBase extends PaimonHiveTestBase { } } } + + test("Paimon View: show create view") { + sql(s"USE $paimonHiveCatalogName") + withDatabase("test_db") { + sql("CREATE DATABASE test_db") + sql("USE test_db") + withTable("t") { + withView("v") { + sql("CREATE TABLE t (id INT, c STRING) USING paimon") + sql(""" + |CREATE VIEW v + |COMMENT 'test comment' + |TBLPROPERTIES ('k1' = 'v1') + |AS SELECT * FROM t + |""".stripMargin) + + val s = sql("SHOW CREATE TABLE v").collectAsList().get(0).get(0).toString + val r = """ + |CREATE VIEW test_db.v \( + | id, + | c\) + |COMMENT 'test comment' + |TBLPROPERTIES \( + | 'k1' = 'v1', + | 'transient_lastDdlTime' = '\d+'\) + |AS + |SELECT \* FROM t + |""".stripMargin.replace("\n", "").r + assert(r.findFirstIn(s.replace("\n", "")).isDefined) + } + } + } + } + + test("Paimon View: describe [extended] view") { + sql(s"USE $paimonHiveCatalogName") + withDatabase("test_db") { + sql("CREATE DATABASE test_db") + sql("USE test_db") + withTable("t") { + withView("v") { + sql("CREATE TABLE t (id INT, c STRING) USING paimon") + sql(""" + |CREATE VIEW v + |COMMENT 'test comment' + |TBLPROPERTIES ('k1' = 'v1') + |AS SELECT * FROM t + |""".stripMargin) + + checkAnswer(sql("DESC TABLE v"), Seq(Row("id", "INT", null), Row("c", "STRING", 
null))) + + val rows = sql("DESC TABLE EXTENDED v").collectAsList() + assert(rows.get(3).toString().equals("[# Detailed View Information,,]")) + assert(rows.get(4).toString().equals("[Name,test_db.v,]")) + assert(rows.get(5).toString().equals("[Comment,test comment,]")) + assert(rows.get(6).toString().equals("[View Text,SELECT * FROM t,]")) + assert(rows.get(7).toString().equals("[View Query Output Columns,[id, c],]")) + assert(rows.get(8).toString().contains("[View Properties,[k1=v1")) + } + } + } + } }
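For reference, a sketch of how the two new physical plans surface to users, mirroring the tests above (it assumes a SparkSession already bound to a Paimon catalog backed by a Hive metastore, as PaimonHiveTestBase sets up; the names match the tests and are otherwise arbitrary):

```scala
// Create a table and a view as in the tests above.
spark.sql("CREATE TABLE t (id INT, c STRING) USING paimon")
spark.sql(
  """CREATE VIEW v
    |COMMENT 'test comment'
    |TBLPROPERTIES ('k1' = 'v1')
    |AS SELECT * FROM t""".stripMargin)

// ShowCreatePaimonViewExec reconstructs the view definition, roughly:
//   CREATE VIEW test_db.v (
//     id,
//     c)
//   COMMENT 'test comment'
//   TBLPROPERTIES (
//     'k1' = 'v1', ...)
//   AS
//   SELECT * FROM t
spark.sql("SHOW CREATE TABLE v").show(truncate = false)

// DescribePaimonViewExec lists the columns (name, type, comment) and, with
// EXTENDED, appends the "# Detailed View Information" rows: Name, Comment,
// View Text, View Query Output Columns and View Properties.
spark.sql("DESCRIBE TABLE EXTENDED v").show(truncate = false)
```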