Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Migration Complexity Explanation providing the summary and rationale for reported complexity #2166

Merged
merged 11 commits into from
Jan 9, 2025
1 change: 1 addition & 0 deletions migtests/scripts/functions.sh
Original file line number Diff line number Diff line change
Expand Up @@ -891,6 +891,7 @@ normalize_json() {
.OptimalSelectConnectionsPerNode? = "IGNORED" |
.OptimalInsertConnectionsPerNode? = "IGNORED" |
.RowCount? = "IGNORED" |
.MigrationComplexityExplanation?= "IGNORED" |
# Replace newline characters in SqlStatement with spaces
.SqlStatement? |= (
if type == "string" then
Expand Down
15 changes: 15 additions & 0 deletions yb-voyager/cmd/assessMigrationCommand.go
Original file line number Diff line number Diff line change
Expand Up @@ -425,6 +425,7 @@ func assessMigration() (err error) {
}

log.Infof("number of assessment issues detected: %d\n", len(assessmentReport.Issues))

utils.PrintAndLog("Migration assessment completed successfully.")
completedEvent := createMigrationAssessmentCompletedEvent()
controlPlane.MigrationAssessmentCompleted(completedEvent)
Expand Down Expand Up @@ -1602,6 +1603,14 @@ func postProcessingOfAssessmentReport() {
func generateAssessmentReportJson(reportDir string) error {
jsonReportFilePath := filepath.Join(reportDir, fmt.Sprintf("%s%s", ASSESSMENT_FILE_NAME, JSON_EXTENSION))
log.Infof("writing assessment report to file: %s", jsonReportFilePath)

var err error
assessmentReport.MigrationComplexityExplanation, err = buildMigrationComplexityExplanation(source.DBType, assessmentReport, "")
if err != nil {
utils.PrintAndLog("ERROR: unable to build migration complexity explanation for json report: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why utils.PrintAndLog, use return fmt.Errorf(..) ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of showing something on console also for users to know that explanation is not there for xyz reason

Copy link
Contributor

@priyanshi-yb priyanshi-yb Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The top level functions which is calling the generateAssessmentReportJson must be printing the error on console for users so it will come eventually right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this not look good? any suggestions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The top level functions which is calling the generateAssessmentReportJson must be printing the error on console for users so it will come eventually right?

No, what i did was to not error out the cmd if building explanation fails...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you suggesting to error out in this scenario?

Copy link
Contributor

@priyanshi-yb priyanshi-yb Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Ideally, we should return the error,and mostly the errors occur while buildMigrationComplexityExplanation is generating HTML string, etc.
For the case, we should fail the command for this new addition, we can use env var approach and if user faces any blocker we can always ask them to disable it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm makes sense, this is the standard code - creating template and executing it.
Should not be error here ideally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
log.Info(assessmentReport.MigrationComplexityExplanation)

strReport, err := json.MarshalIndent(assessmentReport, "", "\t")
if err != nil {
return fmt.Errorf("failed to marshal the assessment report: %w", err)
Expand All @@ -1620,6 +1629,12 @@ func generateAssessmentReportHtml(reportDir string) error {
htmlReportFilePath := filepath.Join(reportDir, fmt.Sprintf("%s%s", ASSESSMENT_FILE_NAME, HTML_EXTENSION))
log.Infof("writing assessment report to file: %s", htmlReportFilePath)

var err error
assessmentReport.MigrationComplexityExplanation, err = buildMigrationComplexityExplanation(source.DBType, assessmentReport, "html")
if err != nil {
utils.PrintAndLog("ERROR: unable to build migration complexity explanation for html report: %v", err)
sanyamsinghal marked this conversation as resolved.
Show resolved Hide resolved
}

file, err := os.Create(htmlReportFilePath)
if err != nil {
return fmt.Errorf("failed to create file for %q: %w", filepath.Base(htmlReportFilePath), err)
Expand Down
17 changes: 9 additions & 8 deletions yb-voyager/cmd/common.go
Original file line number Diff line number Diff line change
Expand Up @@ -1051,14 +1051,15 @@ func storeTableListInMSR(tableList []sqlname.NameTuple) error {

// TODO: consider merging all unsupported field with single AssessmentReport struct member as AssessmentIssue
type AssessmentReport struct {
VoyagerVersion string `json:"VoyagerVersion"`
TargetDBVersion *ybversion.YBVersion `json:"TargetDBVersion"`
MigrationComplexity string `json:"MigrationComplexity"`
SchemaSummary utils.SchemaSummary `json:"SchemaSummary"`
Sizing *migassessment.SizingAssessmentReport `json:"Sizing"`
Issues []AssessmentIssue `json:"-"` // disabled in reports till corresponding UI changes are done(json and html reports)
TableIndexStats *[]migassessment.TableIndexStats `json:"TableIndexStats"`
Notes []string `json:"Notes"`
VoyagerVersion string `json:"VoyagerVersion"`
TargetDBVersion *ybversion.YBVersion `json:"TargetDBVersion"`
MigrationComplexity string `json:"MigrationComplexity"`
MigrationComplexityExplanation string `json:"MigrationComplexityExplanation"`
sanyamsinghal marked this conversation as resolved.
Show resolved Hide resolved
SchemaSummary utils.SchemaSummary `json:"SchemaSummary"`
Sizing *migassessment.SizingAssessmentReport `json:"Sizing"`
Issues []AssessmentIssue `json:"-"` // disabled in reports till corresponding UI changes are done(json and html reports)
TableIndexStats *[]migassessment.TableIndexStats `json:"TableIndexStats"`
Notes []string `json:"Notes"`

// fields going to be deprecated
UnsupportedDataTypes []utils.TableColumnsDataTypes `json:"UnsupportedDataTypes"`
Expand Down
40 changes: 21 additions & 19 deletions yb-voyager/cmd/common_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -129,21 +129,22 @@ func TestAssessmentReportStructs(t *testing.T) {
name: "Validate AssessmentReport Struct Definition",
actualType: reflect.TypeOf(AssessmentReport{}),
expectedType: struct {
VoyagerVersion string `json:"VoyagerVersion"`
TargetDBVersion *ybversion.YBVersion `json:"TargetDBVersion"`
MigrationComplexity string `json:"MigrationComplexity"`
SchemaSummary utils.SchemaSummary `json:"SchemaSummary"`
Sizing *migassessment.SizingAssessmentReport `json:"Sizing"`
Issues []AssessmentIssue `json:"-"`
TableIndexStats *[]migassessment.TableIndexStats `json:"TableIndexStats"`
Notes []string `json:"Notes"`
UnsupportedDataTypes []utils.TableColumnsDataTypes `json:"UnsupportedDataTypes"`
UnsupportedDataTypesDesc string `json:"UnsupportedDataTypesDesc"`
UnsupportedFeatures []UnsupportedFeature `json:"UnsupportedFeatures"`
UnsupportedFeaturesDesc string `json:"UnsupportedFeaturesDesc"`
UnsupportedQueryConstructs []utils.UnsupportedQueryConstruct `json:"UnsupportedQueryConstructs"`
UnsupportedPlPgSqlObjects []UnsupportedFeature `json:"UnsupportedPlPgSqlObjects"`
MigrationCaveats []UnsupportedFeature `json:"MigrationCaveats"`
VoyagerVersion string `json:"VoyagerVersion"`
TargetDBVersion *ybversion.YBVersion `json:"TargetDBVersion"`
MigrationComplexity string `json:"MigrationComplexity"`
MigrationComplexityExplanation string `json:"MigrationComplexityExplanation"`
SchemaSummary utils.SchemaSummary `json:"SchemaSummary"`
Sizing *migassessment.SizingAssessmentReport `json:"Sizing"`
Issues []AssessmentIssue `json:"-"`
TableIndexStats *[]migassessment.TableIndexStats `json:"TableIndexStats"`
Notes []string `json:"Notes"`
UnsupportedDataTypes []utils.TableColumnsDataTypes `json:"UnsupportedDataTypes"`
UnsupportedDataTypesDesc string `json:"UnsupportedDataTypesDesc"`
UnsupportedFeatures []UnsupportedFeature `json:"UnsupportedFeatures"`
UnsupportedFeaturesDesc string `json:"UnsupportedFeaturesDesc"`
UnsupportedQueryConstructs []utils.UnsupportedQueryConstruct `json:"UnsupportedQueryConstructs"`
UnsupportedPlPgSqlObjects []UnsupportedFeature `json:"UnsupportedPlPgSqlObjects"`
MigrationCaveats []UnsupportedFeature `json:"MigrationCaveats"`
}{},
},
}
Expand All @@ -165,9 +166,10 @@ func TestAssessmentReportJson(t *testing.T) {
}

assessmentReport = AssessmentReport{
VoyagerVersion: "v1.0.0",
TargetDBVersion: newYbVersion,
MigrationComplexity: "High",
VoyagerVersion: "v1.0.0",
TargetDBVersion: newYbVersion,
MigrationComplexity: "High",
MigrationComplexityExplanation: "",
SchemaSummary: utils.SchemaSummary{
Description: "Test Schema Summary",
DBName: "test_db",
Expand Down Expand Up @@ -302,12 +304,12 @@ func TestAssessmentReportJson(t *testing.T) {
if err != nil {
t.Fatalf("Failed to write assessment report to JSON file: %v", err)
}

// expected JSON
expectedJSON := `{
"VoyagerVersion": "v1.0.0",
"TargetDBVersion": "2024.1.1.1",
"MigrationComplexity": "High",
"MigrationComplexityExplanation": "",
"SchemaSummary": {
"Description": "Test Schema Summary",
"DbName": "test_db",
Expand Down
174 changes: 154 additions & 20 deletions yb-voyager/cmd/migration_complexity.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,12 +16,14 @@ limitations under the License.
package cmd

import (
"bytes"
"encoding/csv"
"fmt"
"math"
"os"
"path/filepath"
"strings"
"text/template"

"github.com/samber/lo"
log "github.com/sirupsen/logrus"
Expand All @@ -33,15 +35,16 @@ import (
const NOT_AVAILABLE = "NOT AVAILABLE"

var (
LEVEL_1_MEDIUM_THRESHOLD = 20
LEVEL_1_HIGH_THRESHOLD = math.MaxInt32
LEVEL_2_MEDIUM_THRESHOLD = 10
LEVEL_2_HIGH_THRESHOLD = 100
LEVEL_3_MEDIUM_THRESHOLD = 0
LEVEL_3_HIGH_THRESHOLD = 4
LEVEL_1_MEDIUM_THRESHOLD = 20
LEVEL_1_HIGH_THRESHOLD = math.MaxInt32
LEVEL_2_MEDIUM_THRESHOLD = 10
LEVEL_2_HIGH_THRESHOLD = 100
LEVEL_3_MEDIUM_THRESHOLD = 0
LEVEL_3_HIGH_THRESHOLD = 4
migrationComplexityRationale string
)

// Migration complexity calculation from the conversion issues
// Migration complexity calculation based on the detected assessment issues
func calculateMigrationComplexity(sourceDBType string, schemaDirectory string, assessmentReport AssessmentReport) string {
if sourceDBType != ORACLE && sourceDBType != POSTGRESQL {
return NOT_AVAILABLE
Expand All @@ -64,30 +67,37 @@ func calculateMigrationComplexity(sourceDBType string, schemaDirectory string, a
}

func calculateMigrationComplexityForPG(assessmentReport AssessmentReport) string {
if assessmentReport.MigrationComplexity != "" {
return assessmentReport.MigrationComplexity
}

counts := lo.CountValuesBy(assessmentReport.Issues, func(issue AssessmentIssue) string {
return issue.Impact
})
l1IssueCount := counts[constants.IMPACT_LEVEL_1]
l2IssueCount := counts[constants.IMPACT_LEVEL_2]
l3IssueCount := counts[constants.IMPACT_LEVEL_3]

level1IssueCount := counts[constants.IMPACT_LEVEL_1]
level2IssueCount := counts[constants.IMPACT_LEVEL_2]
level3IssueCount := counts[constants.IMPACT_LEVEL_3]
log.Infof("issue counts: level-1=%d, level-2=%d, level-3=%d\n", l1IssueCount, l2IssueCount, l3IssueCount)

utils.PrintAndLog("issue counts: level-1=%d, level-2=%d, level-3=%d\n", level1IssueCount, level2IssueCount, level3IssueCount)
// Determine complexity for each level
comp1 := getComplexityForLevel(constants.IMPACT_LEVEL_1, level1IssueCount)
comp2 := getComplexityForLevel(constants.IMPACT_LEVEL_2, level2IssueCount)
comp3 := getComplexityForLevel(constants.IMPACT_LEVEL_3, level3IssueCount)
comp1 := getComplexityForLevel(constants.IMPACT_LEVEL_1, l1IssueCount)
comp2 := getComplexityForLevel(constants.IMPACT_LEVEL_2, l2IssueCount)
comp3 := getComplexityForLevel(constants.IMPACT_LEVEL_3, l3IssueCount)
complexities := []string{comp1, comp2, comp3}
log.Infof("complexities according to each level: %v", complexities)

finalComplexity := constants.MIGRATION_COMPLEXITY_LOW
// If ANY level is HIGH => final is HIGH
if slices.Contains(complexities, constants.MIGRATION_COMPLEXITY_HIGH) {
return constants.MIGRATION_COMPLEXITY_HIGH
finalComplexity = constants.MIGRATION_COMPLEXITY_HIGH
} else if slices.Contains(complexities, constants.MIGRATION_COMPLEXITY_MEDIUM) {
// Else if ANY level is MEDIUM => final is MEDIUM
finalComplexity = constants.MIGRATION_COMPLEXITY_MEDIUM
}
// Else if ANY level is MEDIUM => final is MEDIUM
if slices.Contains(complexities, constants.MIGRATION_COMPLEXITY_MEDIUM) {
return constants.MIGRATION_COMPLEXITY_MEDIUM
}
return constants.MIGRATION_COMPLEXITY_LOW

migrationComplexityRationale = buildRationale(finalComplexity, l1IssueCount, l2IssueCount, l3IssueCount)
return finalComplexity
}

// This is a temporary logic to get migration complexity for oracle based on the migration level from ora2pg report.
Expand Down Expand Up @@ -200,3 +210,127 @@ func getComplexityForLevel(level string, count int) string {
panic(fmt.Sprintf("unknown impact level %s for determining complexity", level))
}
}

// ======================================= Migration Complexity Explanation ==========================================

const explainTemplateHTML = `
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to avoid having HTML in multiple places. It should all ideally be in the template file. Morevoer, field AssessmentReport.MigrationComplexityExplanation having HTML would be weird.
But I get why you did this. (because json and HTML have different requirements).

Here's one way to do this:

  1. in AssessmentReport, instead of storing just the final string in MigrationComplexityExplanation, store the actual ExplanationData struct.
  2. in HTML template, you have all the necessary information to build the table there.
  3. for json, just exclude the summaries from it. (in fact there's no harm in even keeping it tbh).

Copy link
Contributor

@priyanshi-yb priyanshi-yb Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to this, we should keep all the HTML code in that template

for json, just exclude the summaries from it. (in fact there's no harm in even keeping it tbh).

yes we can keep it in JSON as well, in case of tests and all you are anyways ignoring it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a TODO comment

<p>Below is a detailed breakdown of issues by category, showing the count for each impact level.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Below is a breakdown of the issues detected in different categories for each impact level ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah better.

<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse;">
<thead>
<tr>
<th>Category</th>
<th>Level-1</th>
<th>Level-2</th>
<th>Level-3</th>
<th>Total</th>
</tr>
</thead>
<tbody>
{{- range .Summaries }}
<tr>
<td>{{ .Category }}</td>
<td>{{ index .ImpactCounts "LEVEL_1" }}</td>
<td>{{ index .ImpactCounts "LEVEL_2" }}</td>
<td>{{ index .ImpactCounts "LEVEL_3" }}</td>
<td>{{ .TotalIssueCount }}</td>
</tr>
{{- end }}
</tbody>
</table>

<p>
<strong>Complexity:</strong> {{ .Complexity }}</br>
<strong>Reasoning:</strong> {{ .ComplexityRationale }}
</p>

<p>
<strong>Impact Levels:</strong></br>
Level-1: Resolutions are available with minimal effort.<br/>
Level-2
: Resolutions are available requiring moderate effort.<br/>
Level-3: Resolutions may not be available or are complex.
</p>
`

const explainTemplateText = `Reasoning: {{ .ComplexityRationale }}`

type MigrationComplexityExplanationData struct {
Summaries []MigrationComplexityCategorySummary
Complexity string
ComplexityRationale string // short reasoning or explanation text
}

type MigrationComplexityCategorySummary struct {
Category string
TotalIssueCount int
ImpactCounts map[string]int // e.g. {"Level-1": 3, "Level-2": 5, "Level-3": 2}
}

func buildMigrationComplexityExplanation(sourceDBType string, assessmentReport AssessmentReport, reportFormat string) (string, error) {
if sourceDBType != POSTGRESQL {
return "", nil
}

var explanation MigrationComplexityExplanationData
explanation.Complexity = assessmentReport.MigrationComplexity
explanation.ComplexityRationale = migrationComplexityRationale

explanation.Summaries = buildCategorySummary(assessmentReport.Issues)

var tmpl *template.Template
var err error
if reportFormat == "html" {
tmpl, err = template.New("Explain").Parse(explainTemplateHTML)
} else {
tmpl, err = template.New("Explain").Parse(explainTemplateText)
}

sanyamsinghal marked this conversation as resolved.
Show resolved Hide resolved
if err != nil {
return "", fmt.Errorf("failed creating the explanation template: %w", err)
}

var buf bytes.Buffer
if err := tmpl.Execute(&buf, explanation); err != nil {
return "", fmt.Errorf("failed executing the template with data: %w", err)
}
return buf.String(), nil
}

func buildRationale(finalComplexity string, l1Count int, l2Count int, l3Count int) string {
switch finalComplexity {
case constants.MIGRATION_COMPLEXITY_HIGH:
return fmt.Sprintf("Found %d Level-2 issue(s) and %d Level-3 issue(s), resulting in HIGH migration complexity", l2Count, l3Count)
case constants.MIGRATION_COMPLEXITY_MEDIUM:
return fmt.Sprintf("Found %d Level-1 issue(s), %d Level-2 issue(s) and %d Level-3 issue(s), resulting in MEDIUM migration complexity", l1Count, l2Count, l3Count)
case constants.MIGRATION_COMPLEXITY_LOW:
return fmt.Sprintf("Found %d Level-1 issue(s) and %d Level-2 issue(s), resulting in LOW migration complexity", l1Count, l2Count)
}
return ""
}

func buildCategorySummary(issues []AssessmentIssue) []MigrationComplexityCategorySummary {
summaryMap := make(map[string]*MigrationComplexityCategorySummary)
for _, issue := range issues {
if issue.Category == "" {
continue // skipping unknown category issues
}

if _, ok := summaryMap[issue.Category]; !ok {
summaryMap[issue.Category] = &MigrationComplexityCategorySummary{
Category: issue.Category,
TotalIssueCount: 0,
ImpactCounts: make(map[string]int),
}
}

summaryMap[issue.Category].TotalIssueCount++
summaryMap[issue.Category].ImpactCounts[issue.Impact]++
}

var result []MigrationComplexityCategorySummary
for _, summary := range summaryMap {
summary.Category = utils.SnakeCaseToTitleCase(summary.Category)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using this function we can get the Feature heading for that Enum to be consistent with the heading in report.
but given that we are anyways changing the UI, we can revisit it later so okay for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah lets revisit this. The category here is being fetched from the analyze schema issue.

I see the same issues of formatting in schema reports. We need to fix the root.

result = append(result, *summary)
}
return result
}
5 changes: 5 additions & 0 deletions yb-voyager/cmd/templates/migration_assessment_report.template
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,11 @@
{{ end }}
{{end}}

{{if ne .MigrationComplexity "NOT AVAILABLE"}}
<h2>Migration Complexity Explanation</h2>
<p>{{ .MigrationComplexityExplanation }}</p>
{{end}}

<h2>Unsupported Data Types</h2>
<p>{{.UnsupportedDataTypesDesc}}</p>
{{ if .UnsupportedDataTypes }}
Expand Down
13 changes: 13 additions & 0 deletions yb-voyager/src/utils/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,8 @@ import (
"github.com/samber/lo"
log "github.com/sirupsen/logrus"
"golang.org/x/exp/slices"
"golang.org/x/text/cases"
"golang.org/x/text/language"
)

var DoNotPrompt bool
Expand Down Expand Up @@ -744,3 +746,14 @@ func CheckTools(tools ...string) []string {
func BuildObjectName(schemaName, objName string) string {
return lo.Ternary(schemaName != "", schemaName+"."+objName, objName)
}

// SnakeCaseToTitleCase converts a snake_case string to a title case string with spaces.
func SnakeCaseToTitleCase(snake string) string {
words := strings.Split(snake, "_")
c := cases.Title(language.English)
for i, word := range words {
words[i] = c.String(word)
}

return strings.Join(words, " ")
}