[MAINTENANCE] Rename GE to GX across codebase (GREAT-1352) (great…
cdkini authored Dec 6, 2022
1 parent 23f0242 commit 7e92319
Showing 215 changed files with 799 additions and 817 deletions.
4 changes: 2 additions & 2 deletions SLACK_GUIDELINES.md
@@ -4,7 +4,7 @@
We cannot stress enough that we want this to be a safe, comfortable and inclusive environment. Please read our [code of conduct](https://github.com/great-expectations/great_expectations/blob/develop/CODE_OF_CONDUCT.md) if you need more information on this guideline.

## Keep timezones in mind and be respectful of peoples’ time.
-People on Slack are distributed and might be in a very different time zone from you, so don't use @channel @here (this is reserved for admins anyways). Before you @-mention someone, think about what timezone they are in and if you are likely to disturb them. You can check someone's timezone in their profile. As of today, the core GE team is based solely in the United States but the community is world wide.
+People on Slack are distributed and might be in a very different time zone from you, so don't use @channel @here (this is reserved for admins anyways). Before you @-mention someone, think about what timezone they are in and if you are likely to disturb them. You can check someone's timezone in their profile. As of today, the core GX team is based solely in the United States but the community is world wide.

If you post in off hours be patient, Someone will get back to you once the sun comes up.

@@ -13,7 +13,7 @@ If you post in off hours be patient, Someone will get back to you once the sun c
- Do your best to try and solve the problem first as your efforts will help us more easily answer the question.
- [Read "How to write a good question in Slack"](https://github.com/great-expectations/great_expectations/discussions/4951)
- Head over to our [Documentation](https://docs.greatexpectations.io/en/latest/)
-- Checkout [GitHub Discussions](https://github.com/great-expectations/great_expectations/discussions) this is where we want most of our problem solving, discussion, updates, etc to go because it helps keep a more visible record for GE users.
+- Checkout [GitHub Discussions](https://github.com/great-expectations/great_expectations/discussions) this is where we want most of our problem solving, discussion, updates, etc to go because it helps keep a more visible record for GX users.

#### Asking your question in Slack

22 changes: 11 additions & 11 deletions azure-pipelines-dev.yml
@@ -72,7 +72,7 @@ stages:
tests/integration/fixtures/**
tests/test_sets/**
-[GEChanged]
+[GXChanged]
great_expectations/**/*.py
pyproject.toml
setup.cfg
@@ -89,7 +89,7 @@

jobs:
- job: lint
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
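For readers unfamiliar with this wiring, a hypothetical sketch of how the `scope_check` stage could publish the flag follows. Only the stage, job, and step names are taken from the condition expression above; the detection logic itself is an assumption, not part of this diff:

```yaml
# Hypothetical sketch: the changes job exposes GXChanged as an output
# variable that downstream job conditions can read.
- stage: scope_check
  jobs:
    - job: changes
      steps:
        - script: |
            # Set to true when the watched GX paths were touched
            echo "##vso[task.setvariable variable=GXChanged;isOutput=true]true"
          name: CheckChanges
```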
steps:
- task: UsePythonVersion@0
inputs:
@@ -156,10 +156,10 @@ stages:

- script: |
pip install .
-displayName: 'Install GE and required dependencies (i.e. not sqlalchemy)'
+displayName: 'Install GX and required dependencies (i.e. not sqlalchemy)'
- script: |
python -c "import great_expectations as gx; print('Successfully imported GE Version:', gx.__version__)"
python -c "import great_expectations as gx; print('Successfully imported GX Version:', gx.__version__)"
displayName: 'Import Great Expectations'
- stage: required
@@ -170,7 +170,7 @@ stages:
jobs:
# Runs pytest without any additional flags
- job: minimal
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
strategy:
# This matrix is intended to split up our sizeable test suite into two distinct components.
# By splitting up slow tests from the remainder of the suite, we can parallelize test runs
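As a purely hypothetical illustration of such a split (the matrix leg names and pytest markers below are assumptions, not taken from this diff), a two-way matrix might look like:

```yaml
# Hypothetical sketch of a matrix that runs slow tests in parallel
# with the rest of the suite.
strategy:
  matrix:
    standard:
      pytest_args: '-m "not slow"'  # everything except slow-marked tests
    slow:
      pytest_args: '-m "slow"'      # only the slow-marked tests
steps:
  - script: pytest $(pytest_args)
    displayName: 'Run pytest subset'
```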
@@ -248,7 +248,7 @@ stages:

# Runs pytest with Spark and Postgres enabled
- job: comprehensive
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
strategy:
# This matrix is intended to split up our sizeable test suite into two distinct components.
# By splitting up slow tests from the remainder of the suite, we can parallelize test runs
@@ -323,7 +323,7 @@ stages:

jobs:
- job: test_usage_stats_messages
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)
variables:
python.version: '3.8'

@@ -359,7 +359,7 @@ stages:

jobs:
- job: mysql
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)

services:
mysql: mysql
@@ -416,7 +416,7 @@ stages:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
- job: mssql
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)

services:
mssql: mssql
@@ -463,7 +463,7 @@ stages:
GE_USAGE_STATISTICS_URL: ${{ variables.GE_USAGE_STATISTICS_URL }}
- job: trino
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)

services:
trino: trino
@@ -522,7 +522,7 @@ stages:

jobs:
- job: test_cli
-condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true)
+condition: eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GXChanged'], true)

services:
postgres: postgres
4 changes: 2 additions & 2 deletions azure-pipelines.yml
@@ -466,10 +466,10 @@ stages:

- script: |
pip install .
-displayName: 'Install GE and required dependencies (i.e. not sqlalchemy)'
+displayName: 'Install GX and required dependencies (i.e. not sqlalchemy)'
- script: |
-python -c "import great_expectations as gx; print('Successfully imported GE Version:', gx.__version__)"
+python -c "import great_expectations as gx; print('Successfully imported GX Version:', gx.__version__)"
displayName: 'Import Great Expectations'
- stage: db_integration
4 changes: 2 additions & 2 deletions azure/user-install-matrix.yml
@@ -20,7 +20,7 @@ jobs:
- script: |
great_expectations --version
great_expectations -y init --no-usage-stats
-python -c "import great_expectations as gx; print('Successfully imported GE Version:', gx.__version__)"
+python -c "import great_expectations as gx; print('Successfully imported GX Version:', gx.__version__)"
displayName: 'Confirm installation'
- job:
@@ -47,5 +47,5 @@ jobs:
source activate ge_dev
great_expectations --version
great_expectations -y init --no-usage-stats
-python -c "import great_expectations as gx; print('Successfully imported GE Version:', gx.__version__)"
+python -c "import great_expectations as gx; print('Successfully imported GX Version:', gx.__version__)"
displayName: 'Confirm installation'
38 changes: 19 additions & 19 deletions contrib/capitalone_dataprofiler_expectations/README.md
@@ -38,7 +38,7 @@ If you have suggestions or find a bug, [please open an issue](https://github.com

If you want to install the ml dependencies without generating reports use `DataProfiler[ml]`

If the ML requirements are too strict (say, you don't want to install tensorflow), you can install a slimmer package with `DataProfiler[reports]`. The slimmer package disables the default sensitive data detection / entity recognition (labler)

Install from pypi: `pip install DataProfiler`

@@ -47,7 +47,7 @@ Install from pypi: `pip install DataProfiler`

# What is a Data Profile?

In the case of this library, a data profile is a dictionary containing statistics and predictions about the underlying dataset. There are "global statistics" or `global_stats`, which contain dataset level data and there are "column/row level statistics" or `data_stats` (each column is a new key-value entry).

The format for a structured profile is below:

@@ -57,7 +57,7 @@ The format for a structured profile is below:
"column_count": int,
"row_count": int,
"row_has_null_ratio": float,
"row_is_null_ratio": float,
"row_is_null_ratio": float,
"unique_row_ratio": float,
"duplicate_row_count": int,
"file_type": string,
@@ -84,11 +84,11 @@ The format for a structured profile is below:
"null_types_index": {
string: list[int]
},
"data_type_representation": dict[string, float],
"data_type_representation": dict[string, float],
"min": [null, float, str],
"max": [null, float, str],
"mode": float,
"median": float,
"median": float,
"median_absolute_deviation": float,
"sum": float,
"mean": float,
@@ -98,15 +98,15 @@ The format for a structured profile is below:
"kurtosis": float,
"num_zeros": int,
"num_negatives": int,
"histogram": {
"histogram": {
"bin_counts": list[int],
"bin_edges": list[float],
},
"quantiles": {
int: float
},
"vocab": list[char],
"avg_predictions": dict[string, float],
"avg_predictions": dict[string, float],
"data_label_representation": dict[string, float],
"categories": list[str],
"unique_count": int,
@@ -122,7 +122,7 @@ The format for a structured profile is below:
'std': float,
'sample_size': int,
'margin_of_error': float,
'confidence_level': float
},
"times": dict[string, float],
"format": string
@@ -180,7 +180,7 @@ The format for an unstructured profile is below:
* `duplicate_row_count` - the number of rows that occur more than once in the input dataset
* `file_type` - the format of the file containing the input dataset (ex: .csv)
* `encoding` - the encoding of the file containing the input dataset (ex: UTF-8)
* `correlation_matrix` - matrix of shape `column_count` x `column_count` containing the correlation coefficients between each column in the dataset
* `chi2_matrix` - matrix of shape `column_count` x `column_count` containing the chi-square statistics between each column in the dataset
* `profile_schema` - a description of the format of the input dataset labeling each column and its index in the dataset
* `string` - the label of the column in question and its index in the profile schema
@@ -289,7 +289,7 @@ The format for an unstructured profile is below:
* BAN (bank account number, 10-18 digits)
* CREDIT_CARD
* EMAIL_ADDRESS
* UUID
* HASH_OR_KEY (md5, sha1, sha256, random hash, etc.)
* IPV4
* IPV6
@@ -328,7 +328,7 @@ Along with other attributtes the `Data class` enables data to be accessed via a

```python
# Load a csv file, return a CSVData object
csv_data = Data('your_file.csv')

# Print the first 10 rows of the csv file
print(csv_data.data.head(10))
@@ -346,10 +346,10 @@ print(parquet_data.data.head(10))
json_data = Data('https://github.com/capitalone/DataProfiler/blob/main/dataprofiler/tests/data/json/iris-utf-8.json')
```

If the file type is not automatically identified (rare), you can specify them
specifically, see section [Specifying a Filetype or Delimiter](#specifying-a-filetype-or-delimiter).
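As a brief sketch of that fallback (the `data_type` argument is an assumption based on the referenced section, not shown in this diff):

```python
from dataprofiler import Data

# Assumed usage: force CSV parsing when the extension is ambiguous.
csv_data = Data('your_file.data', data_type='csv')
print(csv_data.data.head(10))
```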

### Profile a File

Example uses a CSV file for example, but CSV, JSON, Avro, Parquet or Text should also work.

@@ -358,7 +358,7 @@ import json
from dataprofiler import Data, Profiler

# Load file (CSV should be automatically identified)
data = Data("your_file.csv")

# Profile the dataset
profile = Profiler(data)
@@ -395,7 +395,7 @@ Note that if the data you update the profile with contains integer indices that

### Merging Profiles

If you have two files with the same schema (but different data), it is possible to merge the two profiles together via an addition operator.

This also enables profiles to be determined in a distributed manner.
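A minimal sketch of that addition-based merge (the file names are placeholders):

```python
from dataprofiler import Data, Profiler

# Profile two files that share a schema, then merge with the addition operator.
profile1 = Profiler(Data("partition_one.csv"))
profile2 = Profiler(Data("partition_two.csv"))
merged_profile = profile1 + profile2
```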

@@ -422,8 +422,8 @@ Note that if merged profiles had overlapping integer indices, when null rows are

### Profiler Differences
For finding the change between profiles with the same schema we can utilize the
profile's `diff` function. The diff will provide overall file and sampling
differences as well as detailed differences of the data's statistics. For
example, numerical columns have a t-test applied to evaluate similarity.
More information is described in the Profiler section of the [Github Pages](
https://capitalone.github.io/DataProfiler/).
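A short sketch of that workflow, assuming `profile1` and `profile2` were built as in the merging example above:

```python
# diff() reports overall and per-statistic differences between two profiles.
diff_report = profile1.diff(profile2)
print(diff_report)
```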
@@ -463,7 +463,7 @@ print(json.dumps(report["data_stats"][0], indent=4))
```

### Unstructured profiler
In addition to the structured profiler, DataProfiler provides unstructured profiling for the TextData object or string. The unstructured profiler also works with list[string], pd.Series(string) or pd.DataFrame(string) given profiler_type option specified as `unstructured`. Below is an example of the unstructured profiler with a text file.
```python
import dataprofiler as dp
import json
@@ -500,4 +500,4 @@ Authors: Anh Truong, Austin Walters, Jeremy Goodsitt
The AAAI-21 Workshop on Knowledge Discovery from Unstructured Data in Financial Services
```

-GE Integration Author: Taylor Turner ([taylorfturner](https://github.com/taylorfturner))
+GX Integration Author: Taylor Turner ([taylorfturner](https://github.com/taylorfturner))
12 changes: 6 additions & 6 deletions docker/Dockerfile
@@ -2,19 +2,19 @@ ARG PYTHON_DOCKER_TAG

FROM python:${PYTHON_DOCKER_TAG}

-ARG GE_EXTRA_DEPS="spark,sqlalchemy,redshift,s3,gcp,snowflake"
+ARG GX_EXTRA_DEPS="spark,sqlalchemy,redshift,s3,gcp,snowflake"

ENV PYTHONIOENCODING utf-8
ENV LANG C.UTF-8
ENV HOME /root
ENV PATH /usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:${HOME}/.local/bin
-# Path where the root of the GE project will be expected
-ENV GE_HOME /usr/app/great_expectations
+# Path where the root of the GX project will be expected
+ENV GX_HOME /usr/app/great_expectations

LABEL maintainer="great-expectations"
LABEL org.opencontainers.image.title="Great Expectations"
LABEL org.opencontainers.image.description="Great Expectations. Always know what to expect from your data."
-LABEL org.opencontainers.image.version=${GE_VERSION}
+LABEL org.opencontainers.image.version=${GX_VERSION}
LABEL org.opencontainers.image.created=${CREATED}
LABEL org.opencontainers.image.url="https://github.com/great-expectations/great_expectations"
LABEL org.opencontainers.image.documentation="https://github.com/great-expectations/great_expectations"
@@ -29,10 +29,10 @@ COPY . /tmp/great_expectations_install

RUN mkdir -p /usr/app ${HOME} && \
cd /tmp/great_expectations_install && \
-pip install .[${GE_EXTRA_DEPS}] && \
+pip install .[${GX_EXTRA_DEPS}] && \
rm -rf /tmp/great_expectations_install

-WORKDIR ${GE_HOME}
+WORKDIR ${GX_HOME}

ENTRYPOINT ["great_expectations"]
CMD ["--help"]
@@ -11,7 +11,7 @@ deployment, with configurations and methods for all supporting components.

The DataContext is configured via a yml file stored in a directory called great_expectations; this configuration
file as well as managed Expectation Suites should be stored in version control. There are other ways to create a
-Data Context that may be better suited for your particular deployment e.g. ephemerally or backed by GE Cloud
+Data Context that may be better suited for your particular deployment e.g. ephemerally or backed by GX Cloud
(coming soon). Please refer to our documentation for more details.

You can Validate data or generate Expectations using Execution Engines including:
2 changes: 1 addition & 1 deletion docs/contributing/style_guides/docs_style.md
@@ -30,7 +30,7 @@ This style guide will be enforced for all incoming PRs. However, certain legacy
:::


-* The **project name “Great Expectations” is always spaced and capitalized.** Good: “Great Expectations”. Bad: “great_expectations”, “great expectations”, “GE.”
+* The **project name “Great Expectations” is always spaced and capitalized.** Good: “Great Expectations”. Bad: “great_expectations”, “great expectations”, “GX.”

* **We refer to ourselves in the first person plural.** Good: “we”, “our”. Bad: “I”. This helps us avoid awkward passive sentences. Occasionally, we refer to ourselves as “the Great Expectations team” (or community) for clarity.

@@ -99,7 +99,7 @@ def file_task(

@workflow
def file_wf(
-dataset: CSVFile = "https://raw.githubusercontent.com/superconductive/ge_tutorials/main/data/yellow_tripdata_sample_2019-01.csv",
+dataset: CSVFile = "https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv",
) -> int:
return file_task(dataset=dataset)

@@ -156,7 +156,7 @@ def to_df(dataset: str) -> pd.DataFrame:
def schema_wf() -> int:
return schema_task(
dataframe=to_df(
dataset="https://raw.githubusercontent.com/superconductive/ge_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
dataset="https://raw.githubusercontent.com/great-expectations/gx_tutorials/main/data/yellow_tripdata_sample_2019-01.csv"
)
)
