Enceladus ECS rollback script: first version for evaluation #2201

Merged
dk1844 merged 7 commits into develop from feature/ecs-mapping-rollback-script
Jan 5, 2024

Conversation


@dk1844 dk1844 commented Dec 8, 2023

Continuation of #2197. This PR introduces a rollback feature in a separate script for dataset_paths_to_ecs.

Naively dev-tested on a local MongoDB instance.

Examples

Help with params overview:

> python dataset_paths_ecs_rollback.py -h
usage: dataset_paths_ecs_rollback [-h] [-n] [-v] [-t TARGETDB] [-s SKIP_PREFIX [SKIP_PREFIX ...]] [-f {hdfsPath,hdfsPublishPath,all}] [-o] (-d DATASET_NAME [DATASET_NAME ...] | -m MTABLE_NAME [MTABLE_NAME ...]) TARGET

Menas MongoDB rollback script changes to ECS

positional arguments:
  TARGET                connection string for target MongoDB

options:
  -h, --help            show this help message and exit
  -n, --dryrun          If specified, skip the actual changes, just print what would be done. (default: False)
  -v, --verbose         Prints extra information while running. (default: False)
  -t TARGETDB, --target-database TARGETDB
                        Name of db on target to be affected. (default: menas)
  -s SKIP_PREFIX [SKIP_PREFIX ...], --skip-prefixes SKIP_PREFIX [SKIP_PREFIX ...]
                        Paths with these prefixes will be skipped from rollback. (default: [])
  -f {hdfsPath,hdfsPublishPath,all}, --fields-to-map {hdfsPath,hdfsPublishPath,all}
                        Rollback either item's 'hdfsPath', 'hdfsPublishPath' or 'all'. (default: all)
  -o, --only-datasets   if specified, only dataset rollback path changes will be done (not MTs). (default: False)
  -d DATASET_NAME [DATASET_NAME ...], --datasets DATASET_NAME [DATASET_NAME ...]
                        list datasets names to rollback path changes in (default: [])
  -m MTABLE_NAME [MTABLE_NAME ...], --mapping-tables MTABLE_NAME [MTABLE_NAME ...]
                        list mapping tables names to rollback path changes in (default: [])
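For readers who want to see how a command line like the one above could be declared, here is a minimal argparse sketch reconstructed only from the help text. The `dest` names (`targetdb`, `skipprefixes`, `fieldstomap`, `onlydatasets`, `mtables`) and the `build_parser` helper are assumptions for illustration; the actual script may be organised differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # ArgumentDefaultsHelpFormatter appends "(default: ...)" to each help line,
    # which matches the style of the help output shown above.
    parser = argparse.ArgumentParser(
        prog="dataset_paths_ecs_rollback",
        description="Menas MongoDB rollback script changes to ECS",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-n", "--dryrun", action="store_true",
                        help="If specified, skip the actual changes, just print what would be done.")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Prints extra information while running.")
    # dest names below are hypothetical; only the flags and metavars come from the help text.
    parser.add_argument("-t", "--target-database", dest="targetdb", metavar="TARGETDB",
                        default="menas", help="Name of db on target to be affected.")
    parser.add_argument("-s", "--skip-prefixes", dest="skipprefixes", metavar="SKIP_PREFIX",
                        nargs="+", default=[],
                        help="Paths with these prefixes will be skipped from rollback.")
    parser.add_argument("-f", "--fields-to-map", dest="fieldstomap",
                        choices=["hdfsPath", "hdfsPublishPath", "all"], default="all",
                        help="Rollback either item's 'hdfsPath', 'hdfsPublishPath' or 'all'.")
    parser.add_argument("-o", "--only-datasets", dest="onlydatasets", action="store_true",
                        help="if specified, only dataset rollback path changes will be done (not MTs).")

    # Exactly one of -d / -m must be given, as the usage line "( -d ... | -m ... )" indicates.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-d", "--datasets", dest="datasets", metavar="DATASET_NAME",
                       nargs="+", default=[], help="list datasets names to rollback path changes in")
    group.add_argument("-m", "--mapping-tables", dest="mtables", metavar="MTABLE_NAME",
                       nargs="+", default=[], help="list mapping tables names to rollback path changes in")

    parser.add_argument("target", metavar="TARGET", help="connection string for target MongoDB")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Putting `-d` and `-m` in a required mutually exclusive group lets argparse reject both "neither given" and "both given" without any hand-written validation.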

Example rollback run for a single dataset

-d dataset(s)
-t target db
-v verbose

> python dataset_paths_ecs_rollback.py mongodb://localhost:27017/admin -d XMSK083 -t menas_remap_test -v   
Menas mongo ECS paths mapping ROLLBACK
Running with settings: dryrun=False, verbose=True
Skipping prefixes: []
  target connection-string: mongodb://localhost:27017/admin
  target DB: menas_remap_test
Dataset names given: ['XMSK083']
Dataset names to rollback path-changes (actually found db): ['XMSK083']
MTs for path-change rollback: ['SourceSystemMappingTable']

Rollbacking path-change of collection dataset_v1 started
Found: 1 dataset documents for a potential path change. In progress ...
Rollbacking path changes for dataset 'XMSK083' v5 (_id=5bbc544b2cdc7510a4930f1f).
  *rollbacking*: hdfsPublishPath: -> /bigdatahdfs/datalake/publish/cpf/XMSK083/
Successfully rollbacked changed path for dataset 'XMSK083' v5 (_id=5bbc544b2cdc7510a4930f1f).

Successfully rollbacked 1 of 1 dataset entries, failed: 0

Rollbacking path-change of collection mapping_table_v1 started
Found: 2 mapping table documents for a potential path change. In progress ...
Rollbacking path changes for mapping table 'SourceSystemMappingTable' v5 (_id=5b6d732ba43a28a6151422aa).
Nothing left to rollback for mapping table 'SourceSystemMappingTable' v5 (_id=5b6d732ba43a28a6151422aa).
Rollbacking path changes for mapping table 'SourceSystemMappingTable' v1 (_id=5abbaa1e8cdba293c9f0b5a3).
  *rollbacking*: hdfsPath: -> /bigdatahdfs/datalake/common/mdrc/publish/LATEST5/SourceSystemMapping
Successfully rollbacked changed path for mapping table 'SourceSystemMappingTable' v1 (_id=5abbaa1e8cdba293c9f0b5a3).

Successfully rollbacked 1 of 2 mapping table entries, failed: 0

Done.
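The run above shows two outcomes per document: a path is restored, or there is "Nothing left to rollback". A rough sketch of that decision on plain dicts follows. This PR description does not show how the original path is stored, so the `bak_` field prefix, the helper names, the choice to match skip prefixes against the current path, and the `s3a://example-bucket` value are all assumptions made up for illustration.

```python
from typing import Dict, Iterable

# Hypothetical naming scheme for where the pre-migration path is kept.
BACKUP_PREFIX = "bak_"


def plan_rollback(doc: Dict, fields: Iterable[str],
                  skip_prefixes: Iterable[str] = ()) -> Dict[str, str]:
    """Return {field: original_path} for every field that can be restored."""
    skip_prefixes = tuple(skip_prefixes)
    plan = {}
    for field in fields:
        original = doc.get(BACKUP_PREFIX + field)
        if original is None:
            continue  # no stored original, so nothing to roll back for this field
        current = doc.get(field, "")
        # Assumption: skip prefixes are compared against the current path value.
        if any(current.startswith(prefix) for prefix in skip_prefixes):
            continue
        plan[field] = original
    return plan


def apply_rollback(doc: Dict, plan: Dict[str, str]) -> bool:
    """Restore the planned fields in place; return True if anything changed."""
    for field, original in plan.items():
        doc[field] = original
        doc.pop(BACKUP_PREFIX + field, None)  # the backup is consumed by the rollback
    return bool(plan)


if __name__ == "__main__":
    doc = {
        "name": "XMSK083",
        "version": 5,
        "hdfsPublishPath": "s3a://example-bucket/publish/cpf/XMSK083/",  # made-up migrated path
        "bak_hdfsPublishPath": "/bigdatahdfs/datalake/publish/cpf/XMSK083/",
    }
    plan = plan_rollback(doc, ["hdfsPath", "hdfsPublishPath"])
    print("changed" if apply_rollback(doc, plan) else "nothing left to rollback")
```

Separating planning from applying keeps the decision logic testable without a database; in the real script the apply step would be a MongoDB update.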

Example rollback run for a single dataset with field mapping (dryrun)

-d dataset(s)
-t target db
-v verbose
-f only hdfsPublishPath will be rolled back (so hdfsPath is not looked at)
-n no changes done, just print

> python dataset_paths_ecs_rollback.py mongodb://localhost:27017/admin -d DM9_actn_Cd -t menas_remap_test -f hdfsPublishPath -v -n
Menas mongo ECS paths mapping ROLLBACK
Running with settings: dryrun=True, verbose=True
Skipping prefixes: []
  target connection-string: mongodb://localhost:27017/admin
  target DB: menas_remap_test
Dataset names given: ['DM9_actn_Cd']
Dataset names to rollback path-changes (actually found db): ['DM9_actn_Cd']
MTs for path-change rollback: []

Rollbacking path-change of collection dataset_v1 started
Found: 2 dataset documents for a potential path change. In progress ...
Rollbacking path changes for dataset 'DM9_actn_Cd' v1 (_id=5bc4821f2da5c3eb4c3c38ca).
  *would rollback* hdfsPublishPath: -> /bigdatahdfs/datalake/publish/dm9/ACTN_CD/country_code=KEN

Rollbacking path changes for dataset 'DM9_actn_Cd' v4 (_id=5c8742fa023679429fef4959).
  *would rollback* hdfsPublishPath: -> /bigdatahdfs/datalake/publish/dm9/ACTN_CD/country_code=BBT
Successfully rollbacked 0 of 2 dataset entries, failed: 0

No mapping tables to rollback path-changes in mapping_table_v1, skipping.
Done.
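In the dry run above the summary reports 0 of 2 entries rolled back, because nothing is written. A small sketch of how `-f` and `-n` could interact with that counter is given below. The `originals` lookup is a hypothetical stand-in for however the script finds the pre-migration paths, and the function names are illustrative only; the print format merely imitates the log lines.

```python
from typing import Dict, List, Tuple

PATH_FIELDS = ("hdfsPath", "hdfsPublishPath")


def select_fields(choice: str) -> List[str]:
    """Expand the -f value: 'all' means both path fields, otherwise just the one named."""
    return list(PATH_FIELDS) if choice == "all" else [choice]


def rollback_many(docs: List[Dict], originals: Dict[str, Dict[str, str]],
                  choice: str, dryrun: bool) -> Tuple[int, int]:
    """Return (rolled_back, total). `originals` maps _id -> {field: original path} (assumed shape)."""
    fields = select_fields(choice)
    rolled_back = 0
    for doc in docs:
        wanted = {f: p for f, p in originals.get(doc["_id"], {}).items() if f in fields}
        for field, path in wanted.items():
            if dryrun:
                print(f"  *would rollback* {field}: -> {path}")
            else:
                print(f"  *rollbacking*: {field}: -> {path}")
                doc[field] = path
        # A dry run never changes anything, so it never counts as a success.
        if wanted and not dryrun:
            rolled_back += 1
    return rolled_back, len(docs)
```

Calling `rollback_many(docs, originals, "hdfsPublishPath", dryrun=True)` leaves every document untouched and returns a zero success count, which mirrors the "0 of 2" summary in the log.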

Example rollback run for a single mapping table

-m mapping table(s)
-t target db
-v verbose

> python dataset_paths_ecs_rollback.py mongodb://localhost:27017/admin -m SourceSystemMappingTable -t menas_remap_test -v   
Menas mongo ECS paths mapping ROLLBACK
Running with settings: dryrun=False, verbose=True
Skipping prefixes: []
  target connection-string: mongodb://localhost:27017/admin
  target DB: menas_remap_test
Mapping table names supplied: ['SourceSystemMappingTable']
Mapping table names given: ['SourceSystemMappingTable']
Mapping table names to rollback path-changes (actually found db): ['SourceSystemMappingTable']

Rollbacking path-change of collection mapping_table_v1 started
Found: 2 mapping table documents for a potential path change. In progress ...
Rollbacking path changes for mapping table 'SourceSystemMappingTable' v5 (_id=5b6d732ba43a28a6151422aa).
Nothing left to rollback for mapping table 'SourceSystemMappingTable' v5 (_id=5b6d732ba43a28a6151422aa).
Rollbacking path changes for mapping table 'SourceSystemMappingTable' v1 (_id=5abbaa1e8cdba293c9f0b5a3).
  *rollbacking*: hdfsPath: -> /bigdatahdfs/datalake/common/mdrc/publish/LATEST5/SourceSystemMapping
Successfully rollbacked changed path for mapping table 'SourceSystemMappingTable' v1 (_id=5abbaa1e8cdba293c9f0b5a3).

Successfully rollbacked 1 of 2 mapping table entries, failed: 0

Done.

Enceladus ECS mapping script - documentation changes
@dk1844 dk1844 marked this pull request as ready for review December 8, 2023 13:04

@miroslavpojer miroslavpojer left a comment


  • pulled
  • code review
  • run

5 review comments on scripts/migration/dataset_paths_ecs_rollback.py (outdated, resolved)
dk1844 and others added 6 commits December 14, 2023 16:23
Enceladus ECS mapping script - -m/--mapping-tables option added
Co-authored-by: miroslavpojer <miroslav.pojer@absa.africa>
Co-authored-by: miroslavpojer <miroslav.pojer@absa.africa>
Co-authored-by: miroslavpojer <miroslav.pojer@absa.africa>
Co-authored-by: miroslavpojer <miroslav.pojer@absa.africa>

sonarcloud bot commented Dec 18, 2023

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

8 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud


@miroslavpojer miroslavpojer left a comment


  • pulled
  • code review done

1 review comment on scripts/migration/dataset_paths_to_ecs.py (resolved)
@miroslavpojer miroslavpojer left a comment


Approving after last answer.

@miroslavpojer miroslavpojer added the PR:tested Only for PR - PR was tested by a tester (person) label Jan 5, 2024
@dk1844 dk1844 merged commit 73f9f7b into develop Jan 5, 2024
5 of 7 checks passed
@dk1844 dk1844 deleted the feature/ecs-mapping-rollback-script branch January 5, 2024 13:01