Commit

File list into xml.read_archive also
remi-braun committed Dec 10, 2024
1 parent 271f022 commit 2dbae25
Showing 2 changed files with 6 additions and 3 deletions.
2 changes: 1 addition & 1 deletion CHANGES.md
@@ -5,7 +5,7 @@
- OPTIM: Don't download an archive stored on the cloud when trying to read a vector stored inside it in `vectors.read`
- OPTIM: Don't download files stored on cloud when applying `ci.assert_files_equal` on them
- OPTIM: Offer the ability to give the archived file list directly to `path.get_archived_file_list` and `files.read_archived_file`, as this operation is expensive when done with large archives stored on the cloud (and thus better done only once).
-  Propagated into `path.get_archived_path`, `path.get_archived_rio_path`, `vectors.read`, `files.read_archived_xml` and `files.read_archived_html`
+  Propagated into `path.get_archived_path`, `path.get_archived_rio_path`, `vectors.read`, `xml.read_archive`, `files.read_archived_xml` and `files.read_archived_html`

## 1.44.0 (2024-12-09)

7 changes: 5 additions & 2 deletions sertit/xml.py
@@ -73,7 +73,9 @@ def read(xml_path: AnyPathStrType) -> _Element:
return root


-def read_archive(path: AnyPathStrType, regex: str = None) -> _Element:
+def read_archive(
+    path: AnyPathStrType, regex: str = None, file_list: list = None
+) -> _Element:
"""
Read an XML file from inside an archive (zip or tar)
Convenient duplicate of :code:`files.read_archived_xml`
@@ -86,6 +88,7 @@ def read_archive(path: AnyPathStrType, regex: str = None) -> _Element:
Args:
path (AnyPathStrType): Path to the XML file, stored inside an archive or path to the archive itself
regex (str): Optional. If specified, the path should be the archive path and the regex should be the key to find the XML file inside the archive.
+file_list (list): List of files contained in the archive. Optional, if not given it will be re-computed.
Returns:
_Element: XML Root
@@ -98,7 +101,7 @@ def read_archive(path: AnyPathStrType, regex: str = None) -> _Element:
if path.startswith("zip://") or path.startswith("tar://"):
path = path[5:]

-return files.read_archived_xml(path, regex)
+return files.read_archived_xml(path, regex, file_list=file_list)

except XMLSyntaxError:
raise ValueError(f"Invalid metadata XML for {path}!")