New Lab in Build a Data Lake with Autonomous Data Warehouse workshop (#…
…334) * more updates * updates before review cycle * Update endpoint.png * Update setup-workshop-environment.md * Update setup-workshop-environment.md * more updates * final update before review * updates * replacement code * Update create-share-recipients.md * Update create-share-recipients.md * Update create-share-recipients.md * Update create-share-recipients.md * Update create-share-recipients.md * Update create-share-recipients.md * updates * Update manifest.json * folder rename * added content to data studio folder * Delete user-bucket-credential-diagram.png * updates self-qa * Update introduction.md * remove extra text files * Update introduction.md * Update setup-workshop-environment.md * Data Studio Workshop Changes * changes to data studio workshop * Update setup-workshop-environment.md * adb changes * Update recipient-diagram.png * diagram change * Update user-bucket-credential-diagram.png * SME feedback * Update create-share.md * Nilay changes * changes * Update consume-share.md * Anoosha's feedback * Update consume-share.md * updated 2 screens and a sentence * minor changes * deleted extra images and added doc references * new ECPU changes * more changes to data sharing workshops * more changes to fork (data studio) * more changes * Marty's feedback * Marty's feedback to plsql workshop too * Update setup-workshop-environment.md * Delete 7381.png * workshop # 3 ADB set up and a couple of minor typos in workshops 1 and 2 * changes to adb-dcat workshop * more changes * minor typos in all 4 workshops * quarterly qa build data lake * new lab 11 in build DL with ADW * minor changes database actions drop-down list * final changes to build data lake workshop * AI updates AI workshop updates * ai workshop updates * Update query-using-select-ai.md * Update query-using-select-ai.md * updates * more updates * Update query-using-select-ai.md * more new updates to ai workshop * Update query-using-select-ai.md * a new screen capture * push Marty's feedback to fork 
Final changes. * updates sandbox manifest * updates * restored sandbox manifest * Update setup-environment.md * updates after CloudWorld * final updates to ai workshop (also new labs 4 and 5) * marty's feedback * incorporated feedback * minor PR edits by Sarah * removed steps 7 & 8 Lab 2 > Task 3 per Alexey The customer asked to remove this as it's not a requirement for the bucket to be public. * more changes * more changes per Alexey * Update load-os-data-public.md * Quarterly QA I added a new step per the PM's request in the Data Sharing PL/SQL workshop. I also made a minor edit (removed space) in the Data Lake workshop. * more updates * Quarterly QA changes * Update consume-share.md * minor edit based on workshop user * quarterly qa November 2023 * Added new videos to the workshop Replaced 3 old silent videos with new ones. Added two new videos. * Adding important notes to the two data sharing workshops Per the PM's request. * folder structure only push to production This push and the PR later is to make sure the folder structure is in the production repo before I start development. Only 1 .md file and the workshops folder. 
* typos * cloud links workshop * UPDATES * Update query-view.png * update * minor updates to chat ai workshop (Fork) * test clones * test pr * Alexey's feedback * Update data-sharing-diagram.png * sarah's edits * changes to Data Load UI * removed script causing ML issue * Update load-local-data.md * updates: deprecated procedure and new code * updates and test * more updates * minor update * testing using a building block in a workshop * updates * building blocks debugging * Update manifest.json * fixing issues * Update manifest.json * delete cleanup.md from workshop folder (use common file) * use common cleanup.md instead of local cleanup.md * test common tasks * update data sharing data studio workshop * Update create-recipient.png * PM's 1 feedback * quarterly qa * missing "Lab 2" from Manifest * always free note addition added a note * always free change * Update setup-environment.md * update manage and monitor workshop * Folder structure for new data share workshop (plus introduction.md) * Updated Load and Analyze from clone * Data Lake minor changes from clone * manage and monitor workshop * Remove the lab from the workshop per Marty's request * mark-hornick-feedback * used marty's setup file * replaced notebook with a new one * updates to lab 6 of manage and monitor * Update adb-auto-scaling.md * Nilay's feedback * Update adb-auto-scaling.md * updates to second ai workshop * note change * Changes to Load and Analyze workshop (other minor changes too) * quarterly qa * Update diagrams per Alexey (remove delta share icon) * updated the 15-minutes workshop * Update analyzing-movie-sales-data.md * ords updates and misc * updated data studio workshop * ORDS and Misc updates * updated freetier version * updated livelabs version * updating the manage and monitor workshop * more updates * lab 11 updates * updated lab 14 * updated freetier * more updates * Update adw-connection-wallet.md * update * Create purge-scn.png * livelabs updates * Update adb-flashback.md * 
final updates * updated screens Ramona's review * Update click-add-peer-database-second-time.png * update the adb-dcat workshop 1. New ord 24.1.0 launchpad. 2. New navigation path to create dynamic groups 3. Updated OML UI * Update see-clone-information-in-details-page-2.png * Requested changes to the Data Lake workshop * more updates * updates to Data Lake workshop * kscope24 workshop for Alexey * new lab & other updates * Update load-os-data-private.md * Update load-os-data-private.md * more updates, new lab * minor update example 2 * Update load-os-data-private.md * Chat AI workshop changes * new notebook * more updates * minor updates * New Iceberg Lab added to the freetier and livelabs workshops * Update manifest.json * Update query-iceberg-tables.md --------- Co-authored-by: Michelle Malcher <[email protected]> Co-authored-by: Sarah Hirschfeld <[email protected]>
1 parent d5aacf6, commit 0fcbec2. Showing 20 changed files with 228 additions and 16 deletions.
Binary file added
BIN
+32.6 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/add-row.png
Binary file added
BIN
+106 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/amazon-aws.png
Binary file added
BIN
+70.2 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/click-athena.png
Binary file added
BIN
+79.2 KB
...data-lake/build-data-lake/query-iceberg-tables/images/create-external-table.png
Binary file added
BIN
+44.6 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/create-table.png
Binary file added
BIN
+35 KB
...d/adw-data-lake/build-data-lake/query-iceberg-tables/images/iceberg-diagram.png
Binary file added
BIN
+151 KB
...ata-lake/build-data-lake/query-iceberg-tables/images/navigate-sql-worksheet.png
Binary file added
BIN
+103 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/query-editor.png
Binary file added
BIN
+70.2 KB
...data-lake/build-data-lake/query-iceberg-tables/images/query-iceberg-table-2.png
Binary file added
BIN
+38.9 KB
...w-data-lake/build-data-lake/query-iceberg-tables/images/query-iceberg-table.png
Binary file added
BIN
+74.5 KB
...-data-lake/build-data-lake/query-iceberg-tables/images/query-table-athena-2.png
Binary file added
BIN
+62.4 KB
...dw-data-lake/build-data-lake/query-iceberg-tables/images/query-table-athena.png
Binary file added
BIN
+57 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/sign-in-page.png
Binary file added
BIN
+93.7 KB
shared/adw-data-lake/build-data-lake/query-iceberg-tables/images/table-created.png
Binary file added
BIN
+48.2 KB
...d/adw-data-lake/build-data-lake/query-iceberg-tables/images/table-populated.png
189 changes: 189 additions & 0 deletions
189
shared/adw-data-lake/build-data-lake/query-iceberg-tables/query-iceberg-tables.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,189 @@ | ||
# Query Iceberg Tables | ||
|
||
## Introduction | ||
|
||
This lab demonstrates the integration of Amazon Athena and Oracle Autonomous Database (ADB). It shows how Athena creates and manages Apache Iceberg tables and how ADB in Oracle Cloud Infrastructure (OCI) reads the table metadata through the AWS Glue catalog. You will learn how to query the Iceberg tables as external tables directly within ADB, so that data stored in AWS can be queried across clouds without replicating it. | ||
|
||
![Iceberg diagram.](images/iceberg-diagram.png =70%x*) | ||
|
||
Estimated Time: 5 minutes | ||
|
||
<!-- Comments: --> | ||
|
||
### Objectives | ||
|
||
In this lab, we will show you how to do the following: | ||
|
||
* Create and populate an Iceberg table in AWS Athena. | ||
* Create an external table in ADB that will access the Iceberg table from within ADB. | ||
* Add a new row to the Iceberg table, and then query the external table in ADB to see the new row. | ||
|
||
### Prerequisites | ||
|
||
Access to an Autonomous Data Warehouse (ADW) instance and to Amazon Athena, if you choose to perform the steps yourself. | ||
|
||
_**Note:** This is not a hands-on task; instead, it is a demo of how to access Amazon Athena data in Autonomous Database using external tables._ | ||
|
||
### About Querying Apache Iceberg Tables | ||
|
||
Autonomous Database supports querying Apache Iceberg tables stored in Amazon Web Services (AWS) or in Oracle Cloud Infrastructure (OCI) Object Storage. | ||
|
||
### About Amazon Athena | ||
|
||
Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. With a few actions in the AWS Management Console, you can point Athena at your data stored in Amazon S3 and begin using standard SQL to run ad-hoc queries and get results in seconds. | ||
|
||
### About Apache Iceberg Tables | ||
|
||
Apache Iceberg is a distributed, community-driven, Apache 2.0-licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it is fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers easy integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more. | ||
|
||
<!-- Comments: --> | ||
|
||
## Task 1: Create and Populate an Iceberg Table in AWS Athena | ||
|
||
1. Log in to Amazon AWS. Navigate to https://aws.amazon.com/, and then click **Sign in to the Console**. | ||
|
||
![Navigate to Amazon AWS.](images/amazon-aws.png =70%x*) | ||
|
||
2. On the **Sign in as IAM user** page, enter your Account ID, IAM user name, and Password, and then click **Sign in**. | ||
|
||
![Sign in to Amazon AWS.](images/sign-in-page.png =45%x*) | ||
|
||
3. On the **Console Home** page, click the **Services** menu and then click **Analytics**. Under **Analytics**, click **Athena**. | ||
|
||
![Click Athena.](images/click-athena.png =65%x*) | ||
|
||
4. On the **Athena** home page, click the **Explore the query editor** button. The **Query editor** page is displayed. In the **Tables and views** section, click the **Create** drop-down list, and then select **CREATE TABLE** from the context menu. | ||
|
||
![The Query editor page.](images/query-editor.png =65%x*) | ||
|
||
5. In the Editor, create an Iceberg table named **`movie_promotion_training`** as follows, and then click **Run**. | ||
|
||
``` | ||
CREATE TABLE movie_promotion_training | ||
( id int, movie_name string, is_active boolean, discount int) | ||
LOCATION 's3://iceberg-bkt-us-west-1/movie_promotion' | ||
TBLPROPERTIES ('table_type' = 'ICEBERG') | ||
``` | ||
![Create an Iceberg table.](images/create-table.png =65%x*) | ||
The new table is created and displayed in the list of available tables. | ||
![The Iceberg table is created.](images/table-created.png =65%x*) | ||
6. Populate the **`movie_promotion_training`** table with some data as follows, and then click **Run**. | ||
``` | ||
-- Insert data into the newly created table | ||
INSERT INTO movie_promotion_training (id, movie_name, is_active, discount) | ||
VALUES | ||
(1, 'Rocky', true, 10), | ||
(2, 'Avatar', true, 10), | ||
(3, 'Big Jake', true, 15); | ||
``` | ||
The new table is populated. | ||
![The Iceberg table is populated.](images/table-populated.png =65%x*) | ||
7. Query the **`movie_promotion_training`** table. The newly added data is displayed. | ||
![Query the Iceberg table.](images/query-table-athena.png =65%x*) | ||
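Step 7 shows the query only in the screenshot. A minimal sketch of such a statement follows, assuming the Athena query editor is set to the `default` database (the database name that appears later in `iceberg_table_path`). The explicit column list and the `ORDER BY` clause are additions made here for a predictable result and are not taken from the lab.

```sql
-- Return all rows from the Iceberg table created in step 5.
-- Assumption: the query editor's current database is "default".
SELECT id, movie_name, is_active, discount
FROM movie_promotion_training
ORDER BY id;
```

With the three rows inserted in step 6, this should list Rocky, Avatar, and Big Jake.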
## Task 2: Navigate to the SQL Worksheet and Create an External Table | ||
1. Navigate to the browser tab that displays the **Data Load** page from the previous lab, and then click **Database Actions** in the banner. On the **Launchpad** page, click the **Development** tab, and then click the **SQL** tab. | ||
![Navigate to the SQL Worksheet.](images/navigate-sql-worksheet.png =65%x*) | ||
2. Create an external table named **`movie_promotion_training`**. The table definition references the AWS Glue catalog, which holds the metadata that locates the actual data files stored in Amazon S3. Copy and paste the following code into your SQL Worksheet, and then click the **Run Script** icon. | ||
``` | ||
<copy> | ||
BEGIN | ||
    DBMS_CLOUD.CREATE_EXTERNAL_TABLE( | ||
        table_name      => 'movie_promotion_training', | ||
        credential_name => 'aws_s3_credential', | ||
        file_uri_list   => NULL, | ||
        format          => '{ | ||
            "access_protocol": { | ||
                "protocol_type": "iceberg", | ||
                "protocol_config": { | ||
                    "iceberg_catalog_type": "aws_glue", | ||
                    "iceberg_glue_region": "us-west-1", | ||
                    "iceberg_table_path": "default.movie_promotion_training" | ||
                } | ||
            } | ||
        }' | ||
    ); | ||
END; | ||
</copy> | ||
``` | ||
![Create external table.](images/create-external-table.png =65%x*) | ||
3. Query the data in the **`movie_promotion_training`** Apache Iceberg table in AWS through the external table. | ||
``` | ||
<copy> | ||
SELECT * | ||
FROM movie_promotion_training; | ||
</copy> | ||
``` | ||
![Query Iceberg table.](images/query-iceberg-table.png =65%x*) | ||
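The `CREATE_EXTERNAL_TABLE` call in step 2 references a credential named `aws_s3_credential`, which this lab does not create. The following is a minimal sketch of how such a credential might be defined with `DBMS_CLOUD.CREATE_CREDENTIAL`, assuming access-key authentication. The key values are placeholders, and the AWS identity they belong to is assumed to have read access to both the Glue catalog and the S3 bucket.

```sql
-- Assumption: AWS access-key authentication; replace both placeholders.
BEGIN
    DBMS_CLOUD.CREATE_CREDENTIAL(
        credential_name => 'aws_s3_credential',
        username        => '<your-aws-access-key-id>',     -- placeholder
        password        => '<your-aws-secret-access-key>'  -- placeholder
    );
END;
/

-- Optional check that the credential exists and is enabled.
SELECT credential_name, username, enabled
FROM user_credentials
WHERE credential_name = 'AWS_S3_CREDENTIAL';
```

The credential would need to exist before the external table in step 2 is created.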
## Task 3: Add a New Row of Data to the Iceberg Table | ||
1. Let's go back to Amazon Athena and add one more row of data to the **`movie_promotion_training`** table. Copy and paste the following in the Query editor in Athena, and then click **Run**. | ||
``` | ||
<copy> | ||
-- Insert one more row into the movie_promotion_training table | ||
INSERT INTO movie_promotion_training (id, movie_name, is_active, discount) | ||
VALUES | ||
(4, 'The Outlaw Josey Wales', true, 10); | ||
</copy> | ||
``` | ||
![Add one more row into the Iceberg table.](images/add-row.png =65%x*) | ||
2. Query the **`movie_promotion_training`** table again. The newly added row is displayed. | ||
![Query the Iceberg table.](images/query-table-athena-2.png =65%x*) | ||
3. Navigate back to the SQL Worksheet in Autonomous Database. Query the data in the **`movie_promotion_training`** Apache Iceberg table again through the external table. | ||
``` | ||
<copy> | ||
SELECT * | ||
FROM movie_promotion_training; | ||
</copy> | ||
``` | ||
![Query Iceberg table.](images/query-iceberg-table-2.png =65%x*) | ||
The changes made to the **`movie_promotion_training`** table in Athena are automatically reflected in Autonomous Database; no data was copied or reloaded. | ||
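To confirm that the row added in Athena is visible from Autonomous Database without rerunning `CREATE_EXTERNAL_TABLE`, a narrower query than `SELECT *` can help. The following is a minimal sketch that assumes the external table from Task 2 and the row with `id` 4 inserted in Task 3.

```sql
-- Run in the ADB SQL Worksheet after the INSERT in Athena has completed.
SELECT COUNT(*) AS total_rows
FROM movie_promotion_training;

-- Look up only the row that was added in Task 3.
SELECT id, movie_name, discount
FROM movie_promotion_training
WHERE id = 4;
```

If the lab was followed as written, the first query should report four rows and the second should return the row for The Outlaw Josey Wales.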
## Learn more | ||
* [Getting Started with Amazon Athena](https://docs.aws.amazon.com/athena/latest/ug/getting-started.html) | ||
* [Apache Iceberg](https://iceberg.apache.org/) | ||
You may now proceed to the next lab. | ||
## Acknowledgements | ||
* **Author:** Lauran K. Serhal, Consulting User Assistance Developer | ||
* **Contributor:** Alexey Filanovskiy, Senior Principal Product Manager | ||
* **Last Updated By/Date:** Lauran K. Serhal, June 2024 | ||
Data about movies in this workshop were sourced from Wikipedia. | ||
Copyright (C) 2024 Oracle Corporation. | ||
Permission is granted to copy, distribute and/or modify this document | ||
under the terms of the GNU Free Documentation License, Version 1.3 | ||
or any later version published by the Free Software Foundation; | ||
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. | ||
A copy of the license is included in the section entitled [GNU Free Documentation License](files/gnu-free-documentation-license.txt) |