add filters for irrelevant files in ISISDATA (#5109)
* adding filters for old files in ISISDATA area
* adding script changes and fixing PR
* used exclude instead of exclude-from
* cleaned up files and script
* added test to see if filtering on rclone ISISDATA worked
* modified tests
* updating imports in tests
* added complete test for filters, list of filter files can be changed
* updated test to check args rather than files
* removed absolute paths
* updated test to use exclude list passed in
* Updated filter args

  I made the following changes to create_rclone_arguments:

  - Added a default --filter flag with the value of exclude_string.
  - Added a check for any additional --include and --exclude flags passed in by the user.
  - Added logic to merge any additional --include and --exclude flags with the default --filter flag. The include and exclude patterns are concatenated with the default exclude_string, separated by commas.
  - Removed f"--exclude={exclude_string}" from the extra_args list, since the default exclude_string is now included in the filter flag.
  - Added the filter flag to the extra_args list using extra_args.extend(filter_args).

  These changes ensure that any --include and --exclude flags passed in by the user are taken into account when creating extra_args, and that they are merged with the default filter flag, which is the recommended way as per the rclone docs.
* accidentally deleted a line
* Added more logic for when the user provides their own filters, along with a few other changes:

  + Added a check for a --filter flag provided by the user. If present, it is used and the default filter flag is ignored; otherwise the default filter flag is used. This honors the user's intention when they supply a specific filter.
  + Added a check for any additional --include and --exclude flags passed in by the user, which are merged with the filter flag. This accounts for any specific include/exclude patterns the user wants to apply on top of the default filter.
  + Added a "+" at the end of the filter string if the user has specified an --include flag. This simulates the behavior of --include, which includes the specified patterns and excludes everything else.
  + Added a check for filter_args_provided: if provided, that flag is used; otherwise the default filter flag is used and any additional include/exclude flags are merged into it.

  There is also a test in the pytest file that checks the filter logic works as expected; run it using `pytest`.
* fixed tests and adjusted with new filters
* Added several tests to pytests

  **The first test that was added is `test_rclone_with_auth`**
  This test checks the behavior of the `rclone` function when it is called with an `auth` parameter: the function should pass the "auth" parameter through to the underlying subprocess call.

  **The second test that was added is `test_create_rclone_args_with_no_kwargs`**
  This test checks the behavior of the `create_rclone_args` function when it is called with no keyword arguments: the function should handle that case and return the correct list of arguments to be passed to the underlying subprocess call.

  **The third test that was added is `test_file_filtering_with_hidden_files`**
  This test checks the behavior of the `file_filtering` function when the specified directory contains hidden files: the function should filter out hidden files and return only the non-hidden files in that directory.

  I have tested to confirm these all run and pass on my machine.
* added more tests to pytest

  **added `test_rclone` test**
  This test mocks the `subprocess.Popen` function and checks that the output of the `rclone` function matches the expected output when it is invoked with the arguments `lsf`, `test`, `["-l", "-R", "--format", "p", "--files-only"]`, `True`, and `True`.

  **added `test_rclone_unknown_exception` test**
  This test mocks the `subprocess.Popen` function with a class that raises an exception when it is initialized, and checks that the `rclone` function raises an exception when an unknown exception is encountered.

  I have tested and confirmed these work on my system.
* fixed filter args

  Fixed how filters are input; patterns still aren't working. I believe I need to look at the patterns and ensure they adhere to the syntax described in the rclone documentation.
* updated filter list, it starts search from {mission_name}/kernels/
* finally fixed filters and tests
* added findfeaturesSegment script
* added stuff
* fixed parsing issues
* cleaning up
* more cleaning up
* added new regex filter
* re-implementing Kelvin's changes. accidentally re-based and merged over them.
* comma typo
* fixed capitalization.
* fixed, i.e. shortened, the regex pattern

  Shortened it to spk/spk_psp_rec*, as the paths it searches in mission areas are actually subfolders of the mission folder. I believe this method saves time, and thus it is important to preserve behavior here.
* Add dry run flag
* Roll back change to dry run

---------

Co-authored-by: Kelvin Rodriguez <krodriguez@usgs.gov>
Co-authored-by: Austin Sanders <arsanders@usgs.gov>
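
The filter-merging behavior described in this commit message can be sketched roughly as below. This is not the code from the PR: `merge_filter_args` is a hypothetical helper name, and the choice to emit one `--filter` rule per pattern and to close an include list with `- **` is an assumption about how the merge could be done using rclone's filter rule syntax. The only pattern taken from the commit message is `spk/spk_psp_rec*`; the other patterns in the usage line are made up for illustration.

```python
from typing import List, Sequence


def merge_filter_args(default_excludes: Sequence[str],
                      user_args: Sequence[str]) -> List[str]:
    """Hypothetical helper: merge default exclude patterns with user flags."""
    # Normalise "--flag=value" into ["--flag", "value"] so both spellings work.
    tokens: List[str] = []
    for arg in user_args:
        if arg.startswith(("--filter=", "--include=", "--exclude=")):
            flag, value = arg.split("=", 1)
            tokens.extend([flag, value])
        else:
            tokens.append(arg)

    # A user-supplied --filter wins: return the (normalised) user arguments
    # without adding any default rules.
    if "--filter" in tokens:
        return tokens

    # Default excludes go first, so they take precedence over user includes
    # (rclone applies the first rule that matches).
    rules = [f"- {pattern}" for pattern in default_excludes]
    passthrough: List[str] = []
    saw_include = False
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("--include", "--exclude") and i + 1 < len(tokens):
            sign = "+" if tok == "--include" else "-"
            saw_include = saw_include or tok == "--include"
            rules.append(f"{sign} {tokens[i + 1]}")
            i += 2
        else:
            # Unrelated flags (for example --dry-run) are kept as they are.
            passthrough.append(tok)
            i += 1

    # Assumption: mimic --include semantics by excluding everything else.
    if saw_include:
        rules.append("- **")

    merged = list(passthrough)
    for rule in rules:
        merged.extend(["--filter", rule])
    return merged


merged_args = merge_filter_args(["spk/spk_psp_rec*"],
                                ["--dry-run", "--include", "ck/*.bc"])
```

Placing the default rules ahead of the user's rules is a design choice in this sketch: it keeps the default exclusions in force even when the user adds broad include patterns. Reversing the order would let user includes override the defaults instead.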