[Kendra] Jira connector not syncing #4287

Closed
1 task
ssmails opened this issue Sep 30, 2024 · 13 comments
Labels
bug This issue is a confirmed bug. kendra p2 This is a standard priority issue service-api This issue is caused by the service API, not the SDK implementation.

Comments

@ssmails

ssmails commented Sep 30, 2024

Describe the bug

I created a Jira connector following this documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/create_data_source.html

The data source is created, but the sync fails, and Kendra logs no error for the failure.
Creating and syncing a data source from the Kendra UI with the same configuration works, so this looks like a boto3/Kendra compatibility issue.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The Jira sync should succeed.

Current Behavior

The data source is created, but the sync fails, and Kendra logs no error for the failure.
Creating and syncing a data source from the Kendra UI with the same configuration works, so this looks like a boto3/Kendra compatibility issue.

Reproduction Steps

    def create_new_data_source_jira(self, index_id: str):
        """
        Creates a new Kendra data source based on the configuration provided in the YAML file.

        Returns:
        str: The ID of the created data source, or the ID of the existing data source if one already exists.
        """
        data_source_config = self.config['data_source']

        logger.info(f"role ARN={data_source_config['configuration']['role_arn']}")
        logger.info(f"indexid={data_source_config['indexid']}")

        try:
            response = self.kendra.create_data_source(
                RoleArn=data_source_config['configuration']['role_arn'],
                Name=data_source_config['name'],
                IndexId=data_source_config['indexid'],
                Type=data_source_config['type'],
                Configuration={
                    'JiraConfiguration': {
                        'JiraAccountUrl': 'working jira url which works from kendra ui',
                        'SecretArn': 'working secret ARN which works from kendra ui',
                        'IssueSubEntityFilter': ['COMMENTS','ATTACHMENTS','WORKLOGS'],
                        'IssueType': ['BUG','STORY','TASK','EPIC'],
                        'AttachmentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'CommentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'IssueFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'ProjectFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'WorkLogFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                    }
                }
            )

            data_source_id = response['Id']
            logger.info(f"Data source created with ID: {data_source_id}")

            return data_source_id
        except Exception as e:
            logger.error(f"Error creating data source: {str(e)}")
            raise KendraAdapterException(f"Error creating data source: {str(e)}") from e

    def start_ingestion_jira(self, index_id, data_source_id):
        try:
            response = self.kendra.start_data_source_sync_job(
                Id=data_source_id,
                IndexId=index_id
            )

            sync_job_id = response['ExecutionId']
            logger.info(f"Data source sync job started with ID: {sync_job_id}")
        except Exception as e:
            logger.error(f"Error starting data source sync job: {str(e)}")
            raise KendraAdapterException(f"Error starting data source sync job: {str(e)}")

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.35.16

Environment details (OS name and version, etc.)

Mac

@ssmails ssmails added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Sep 30, 2024
@tim-finnigan tim-finnigan self-assigned this Oct 1, 2024
@tim-finnigan
Contributor

Thanks for reaching out. Can you provide more details regarding the sync failure? What error are you getting? The create_data_source command makes a request to the CreateDataSource API, so we'll need more information to investigate if there is some issue with the underlying API behavior.

Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script?
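
For readers who want the debug output in a file that can be redacted before sharing, here is a minimal sketch that uses only the standard `logging` module. It assumes the SDK logs under the `boto3` and `botocore` logger names; the function name and the log path are hypothetical. Unlike `boto3.set_stream_logger('')`, it does not capture loggers outside those two namespaces.

```python
import logging


def enable_sdk_debug_logging(log_path):
    """Send DEBUG-level boto3/botocore log records to a file.

    Hypothetical helper: `log_path` is any writable file path.
    Returns the handler so the caller can flush or remove it later.
    """
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s [%(levelname)s] %(message)s")
    )
    # Child loggers such as "botocore.endpoint" inherit the level and
    # propagate their records up to these two parent loggers.
    for name in ("boto3", "botocore"):
        sdk_logger = logging.getLogger(name)
        sdk_logger.setLevel(logging.DEBUG)
        sdk_logger.addHandler(handler)
    return handler
```

Calling `enable_sdk_debug_logging("kendra-debug.log")` before creating the client should write the request and response details to that file instead of the console.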

@tim-finnigan tim-finnigan added investigating This issue is being investigated and/or work is in progress to resolve the issue. kendra response-requested Waiting on additional information or feedback. p2 This is a standard priority issue and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. needs-triage This issue or PR still needs to be triaged. labels Oct 1, 2024
@ssmails
Author

ssmails commented Oct 2, 2024

@tim-finnigan Thanks. Unfortunately, there is no error in CloudWatch for Kendra for this sync failure.

It is simple to reproduce.

  1. Use boto3 to create a Jira connector with the following Jira configuration:
                    'JiraConfiguration': {
                        'JiraAccountUrl': 'working jira url which works from kendra ui',
                        'SecretArn': 'working secret ARN which works from kendra ui',
                        'IssueSubEntityFilter': ['COMMENTS','ATTACHMENTS','WORKLOGS'],
                        'IssueType': ['BUG','STORY','TASK','EPIC'],
                        'AttachmentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'CommentFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'IssueFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'ProjectFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                        'WorkLogFieldMappings': [
                            {
                                'DataSourceFieldName': 'url',
                                'IndexFieldName': '_source_uri'
                            },
                        ],
                    }
  2. Sync the data source that was created -> the sync fails.

@tim-finnigan
Contributor

Can you share a complete code snippet for reproducing the behavior, as well as debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script? I'd like to see the specific API response that is being returned here.

@ssmails
Author

ssmails commented Oct 2, 2024

Sharing the code snippet for the sync failure and the requested logs. @tim-finnigan

Note: the data source was pre-created with the 'JiraConfiguration' provided above, using the boto3 SDK. (That operation was successful.)

jira-boto3-log-failure.txt

import time
import logging

import boto3
from botocore.exceptions import ClientError

# Emit debug-level logs for all loggers, including botocore's request/response logs.
boto3.set_stream_logger('')

def jira_ingest():
    try:
        logging.debug("START.")

        kendra = boto3.client(
            'kendra'
        )
        logging.info("Kendra client initialized successfully.")
        
        response = kendra.start_data_source_sync_job(
            Id="mydatssourceid",
            IndexId="myindexid"
        )
        sync_job_id = response['ExecutionId']
        logging.debug(f"Data source sync job started with ID: {sync_job_id}")
        time.sleep(600)

    except Exception as e:
        logging.error(f"Error: {str(e)}")


if __name__ == '__main__':
    jira_ingest()
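
The fixed `time.sleep(600)` above gives no feedback on how the job ended. A minimal sketch of polling instead, assuming the job shows up in the first page of `list_data_source_sync_jobs` results; the helper name, the set of statuses treated as terminal, and the timing defaults are assumptions, and `kendra` is a client such as `boto3.client('kendra')`.

```python
import time

# Statuses treated as "finished" in this sketch (an assumption, adjust as needed).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"}


def wait_for_sync_job(kendra, index_id, data_source_id, execution_id,
                      poll_seconds=30, timeout_seconds=600, sleep=time.sleep):
    """Poll until the given sync job reaches a terminal status.

    Returns the job's history entry, or None if the timeout elapses first.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    waited = 0
    while True:
        response = kendra.list_data_source_sync_jobs(
            Id=data_source_id,
            IndexId=index_id,
        )
        for job in response.get("History", []):
            if (job.get("ExecutionId") == execution_id
                    and job.get("Status") in TERMINAL_STATUSES):
                return job
        if waited >= timeout_seconds:
            return None
        sleep(poll_seconds)
        waited += poll_seconds
```

The returned entry carries the final `Status`, plus `ErrorCode` and `ErrorMessage` when the service reports them.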

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Oct 3, 2024
@tim-finnigan
Contributor

Thanks for following up. From your logs I see that a sync job was started and no error or failure is present. When you say the sync fails, could you share more details on the failure?

The Kendra Developer Guide has a troubleshooting section on failed sync jobs: https://docs.aws.amazon.com/kendra/latest/dg/troubleshooting-data-sources.html#troubleshooting-data-sources-failed. Have you tried going through the steps documented there?

@tim-finnigan tim-finnigan added response-requested Waiting on additional information or feedback. service-api This issue is caused by the service API, not the SDK implementation. labels Oct 7, 2024
@ssmails
Author

ssmails commented Oct 8, 2024

The sync shows as failed in Kendra without any errors.
I have already consulted the developer guide. I want to mention again that with the same role, policy, and other configuration, the sync works when I run it manually from the Kendra UI.
So this seems to be an issue with the boto3 library for Jira. I have also provided the code snippet to reproduce the issue, as requested earlier.

@ssmails
Author

ssmails commented Oct 8, 2024

@tim-finnigan appreciate your quick response on this. All requested information has been provided. Thanks.

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Oct 9, 2024
@tim-finnigan
Contributor

The sync shows failed in Kendra without any errors.

From your logs I see that the StartDataSourceSyncJob request succeeded. Can you provide more details on the failure? You can try running list_data_source_sync_jobs to get more info.
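
A minimal sketch of that check, assuming `kendra` is a client such as `boto3.client('kendra')`; the helper name is hypothetical. It pulls the status and any error details the service recorded for recent sync jobs.

```python
def summarize_sync_jobs(kendra, index_id, data_source_id, max_results=10):
    """Return (execution_id, status, error_code, error_message) per recent job.

    Error fields are None when the service did not report them.
    """
    response = kendra.list_data_source_sync_jobs(
        Id=data_source_id,
        IndexId=index_id,
        MaxResults=max_results,
    )
    rows = []
    for job in response.get("History", []):
        rows.append((
            job.get("ExecutionId"),
            job.get("Status"),
            job.get("ErrorCode"),
            job.get("ErrorMessage"),
        ))
    return rows
```

Printing each returned tuple for the failing data source would show whether the service attached an `ErrorCode` or `ErrorMessage` to the failed job.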

Also here are docs on setting up the Jira connector, have you followed those? https://docs.aws.amazon.com/kendra/latest/dg/data-source-jira.html

@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Oct 11, 2024
@ssmails
Author

ssmails commented Oct 14, 2024

FYI, we reached out to the Kendra team, and their response is below. @tim-finnigan
We would appreciate it if the boto3 docs could be updated accordingly, so people are aware of which connectors work.
The current boto3 docs don't mention using the templatized connector at all, and the sample config is for the old mechanism.
We are running into this for several Kendra connectors that we are trying to use via boto3.
Is there a list of working connectors with accurate boto3 examples for Kendra that you can provide?

I hope you are doing well.

I am reaching out with the latest updates on this bug.

The issue is with the non-templatized Jira connector. We are planning on launching a templatized connector which will resolve the issue, but we are still awaiting the date to launch the templatized connector from our team.

Our internal team have tried debugging the issue but were unable to identify a better solution to fix this. I will update you once we have a date for the templatized connector to be launched.

@tim-finnigan
Contributor

Thanks for following up. I found the internal tracking item for this issue. Regarding the "non-templatized Jira connector" issue, that is something that the Kendra team would need to address. They should also be able to provide guidance for your use case.

There are not currently Kendra examples for boto3 in the documentation or code examples repository. The Kendra developer guide has a getting started guide but no specific Python SDK examples I could find for data source connectors. If there's a specific connector that you want to try other than the Jira one then let me know and I can try to look into it. But I recommend that you continue your correspondence on the source case with any additional questions to receive further guidance from the appropriate team.

@tim-finnigan tim-finnigan removed the response-requested Waiting on additional information or feedback. label Oct 14, 2024
@tim-finnigan tim-finnigan removed their assignment Oct 14, 2024
@ssmails
Author

ssmails commented Oct 18, 2024

@tim-finnigan, the Kendra team was able to provide us with a workaround.
If there is a place where you keep examples/docs, I would be happy to contribute.

@tim-finnigan
Contributor

Followed up here in your other issue regarding where examples could be added. The Kendra team followed up on the internal ticket and noted that a fix for the Jira connector was deployed.


github-actions bot commented Nov 4, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
