ocifs classes

OCIFileSystem

class ocifs.core.OCIFileSystem(*args, **kwargs)

Bases: AbstractFileSystem

Access oci as if it were a file system.

This exposes a filesystem-like API (ls, cp, open, etc.) on top of oci object storage.

Parameters
  • config (Union[dict, str, None]) – Config for the connection to OCI. If a dict, it should be returned from oci.config.from_file. If a str, it should be the location of the config file. If None, the user should have a Resource Principal configured environment; if Resource Principal is not available, Instance Principal is used.

  • signer (oci.auth.signer) – A signer from the OCI sdk. More info: oci.auth.signers

  • profile (str) – The profile to use from the config (If the config is passed in)

  • iam_type (str (None)) – The IAM Auth principal type to use. Values can be one of [“api_key”, “resource_principal”, “instance_principal”]

  • region (str (None)) – The Region Identifier that the client should connect to. Regions can be found here: https://docs.oracle.com/en-us/iaas/Content/General/Concepts/regions.htm

  • default_block_size (int (None)) – If given, the default block size used for open() when no specific value is passed. The built-in default is 5MB.

  • config_kwargs (dict) – dict of parameters passed to the OCI Client upon connection. More info here: oci.object_storage.ObjectStorageClient.__init__

  • oci_additional_kwargs (dict) – dict of parameters that are used when calling oci api methods. Typically used for things like “retry_strategy”.

  • kwargs (dict) – dict of other parameters for the oci session. This includes default parameters for tenancy, namespace, and region. Any other parameters are passed along to AbstractFileSystem’s init method.
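
The constructor parameters above might be assembled as in the following minimal sketch. The helper names `build_fs_kwargs` and `open_filesystem`, the config file location, and the profile name are illustrative assumptions, not part of ocifs. The import is deferred so the kwargs helper can be tried without the package or credentials.

```python
from typing import Optional


def build_fs_kwargs(config: Optional[str] = None,
                    profile: Optional[str] = None,
                    region: Optional[str] = None) -> dict:
    """Collect only the constructor arguments that were actually supplied."""
    candidates = {"config": config, "profile": profile, "region": region}
    return {name: value for name, value in candidates.items() if value is not None}


def open_filesystem(**kwargs):
    """Create the filesystem; needs the ocifs package and valid OCI credentials."""
    from ocifs.core import OCIFileSystem  # deferred import, see lead-in
    return OCIFileSystem(**kwargs)


# Hypothetical values: a config file location and a named profile.
kwargs = build_fs_kwargs(config="~/.oci/config", profile="DEFAULT")
empty = build_fs_kwargs()
```

Calling `build_fs_kwargs()` with no arguments yields an empty dict, which corresponds to the config=None case described above.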

async_impl = False
blocksize = 4194304
bulk_delete(pathlist, **kwargs)

Remove multiple keys with one call

Parameters

pathlist (list of strings) – The keys to remove, must all be in the same bucket.
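
Because all keys passed to bulk_delete must be in the same bucket, a caller holding a mixed list may want to group paths first. The sketch below is one possible way to do that; `group_by_bucket` and the example paths are hypothetical, and the grouping assumes the oci://bucket@namespace/key form used elsewhere in this reference.

```python
from collections import defaultdict


def group_by_bucket(paths):
    """Group oci://bucket@namespace/key paths by their bucket@namespace part."""
    groups = defaultdict(list)
    for path in paths:
        stripped = path[len("oci://"):] if path.startswith("oci://") else path
        bucket, _, _key = stripped.partition("/")
        groups[bucket].append(path)
    return dict(groups)


paths = [
    "oci://logs@myns/2024/a.txt",
    "oci://logs@myns/2024/b.txt",
    "oci://archive@myns/old.txt",
]
batches = group_by_bucket(paths)
# Each list in batches.values() could then be passed to bulk_delete separately.
```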

cachable = True
cat(path, recursive=False, on_error='raise', **kwargs)

Fetch (potentially multiple) paths’ contents

Parameters
  • recursive (bool) – If True, assume the path(s) are directories, and get all the contained files

  • on_error ("raise", "omit", "return") – If raise, an underlying exception will be raised (converted to KeyError if the type is in self.missing_exceptions); if omit, keys with exception will simply not be included in the output; if “return”, all keys are included in the output, but the value will be bytes or an exception instance.

  • kwargs – passed to cat_file

Returns

dict of {path: contents} if there are multiple paths or the path has been otherwise expanded
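
The three on_error modes described above can be illustrated with a small in-memory stand-in. This is a sketch of the documented behaviour only, not the library's implementation; `cat_with_policy` and the dict-backed `store` are hypothetical.

```python
def cat_with_policy(store, paths, on_error="raise"):
    """Mimic the documented on_error modes against an in-memory dict of bytes."""
    out = {}
    for path in paths:
        try:
            out[path] = store[path]
        except KeyError as exc:
            if on_error == "raise":
                raise                # propagate the underlying exception
            if on_error == "return":
                out[path] = exc      # keep the key, value is the exception
            # on_error == "omit": leave the failing key out of the result
    return out


store = {"b/one": b"1", "b/two": b"2"}
omitted = cat_with_policy(store, ["b/one", "b/missing"], on_error="omit")
returned = cat_with_policy(store, ["b/one", "b/missing"], on_error="return")
```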

cat_file(path, start=None, end=None, **kwargs)

Get the content of a file

Parameters
  • path – URL of file on this filesystem

  • start, end (int) – Bytes limits of the read. If negative, backwards from end, like usual python slices. Either can be None for start or end of file, respectively

  • kwargs – passed to open().
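
Since start and end follow the usual Python slice rules, the resulting absolute byte range can be worked out with the built-in slice type. The helper `byte_range` below is a hypothetical illustration of that rule, not part of ocifs.

```python
def byte_range(size, start=None, end=None):
    """Resolve start/end (possibly negative or None) to absolute byte offsets."""
    first, last, _step = slice(start, end).indices(size)
    return first, last


data = b"0123456789"
first, last = byte_range(len(data), start=-4)   # last four bytes
tail = data[first:last]
middle = byte_range(len(data), start=2, end=-2)
whole = byte_range(len(data))
```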

cat_ranges(paths, starts, ends, max_gap=None, on_error='return', **kwargs)
checksum(path, **kwargs)

Unique value for current version of file

If the checksum is the same from one moment to another, the contents are guaranteed to be the same. If the checksum changes, the contents might have changed.

Parameters
  • path (string/bytes) – path of file to get checksum for

  • refresh (bool (=False)) – if False, look in local cache for file details first

classmethod clear_instance_cache()

Clear the cache of filesystem instances.

Notes

Unless overridden by setting the cachable class attribute to False, the filesystem class stores a reference to newly created instances. This prevents Python’s normal rules around garbage collection from working, since the instance’s refcount will not drop to zero until clear_instance_cache is called.

connect(refresh=True)

Establish oci connection object.

Parameters

refresh (bool) – Whether to create a new session/client, even if a previous one with the same parameters already exists. If False, an existing one will be used if possible.

connect_timeout = 5
copy(path1, path2, destination_region=None, **kwargs)

Copy file between locations on OCI

Parameters
  • path1 (str) – URI of source data

  • path2 (str) – URI of destination data

  • destination_region (str) – the region you want path2 to be written in (defaults to the region of your config)

copy_basic(path1, path2, destination_region=None, **kwargs)

Copy file between locations on OCI

Not allowed where the origin is >50GB

Parameters
  • path1 (str) – URI of source data

  • path2 (str) – URI of destination data

  • destination_region (str) – the region you want path2 to be written in (defaults to the region of your config)

cp(path1, path2, **kwargs)

Alias of AbstractFileSystem.copy.

cp_file(path1, path2, **kwargs)
created(path)

Return the created timestamp of a file as a datetime.datetime

classmethod current()

Return the most recently instantiated FileSystem

If no instance has been created, then create one with defaults

default_block_size = 5242880
delete(path, recursive=False, maxdepth=None)

Alias of AbstractFileSystem.rm.

disk_usage(path, total=True, maxdepth=None, **kwargs)

Alias of AbstractFileSystem.du.

download(rpath, lpath, recursive=False, **kwargs)

Alias of AbstractFileSystem.get.

du(path, total=True, maxdepth=None, **kwargs)

Space used by files within a path

Parameters
  • path (str) –

  • total (bool) – whether to sum all the file sizes

  • maxdepth (int or None) – maximum number of directory levels to descend, None for unlimited.

  • kwargs – passed to ls

Returns

dict of {fn: size} if total=False, or int otherwise, where numbers refer to bytes used.

end_transaction()

Finish write transaction, non-context version

exists(path, **kwargs)

Is there a file at the given path

expand_path(path, recursive=False, maxdepth=None)

Turn one or more globs or directories into a list of all matching paths to files or directories.

find(path, maxdepth=None, withdirs=False, detail=False, **kwargs)

List all files below path.

Like posix find command without conditions

Parameters
  • path (str) –

  • maxdepth (int or None) – If not None, the maximum number of levels to descend

  • withdirs (bool) – Whether to include directory paths in the output. This is True when used by glob, but users usually only want files.

  • kwargs – passed to ls.

static from_json(blob)

Recreate a filesystem instance from JSON representation

See .to_json() for the expected structure of the input

Parameters

blob (str) –

Returns

file system instance, not necessarily of this particular class.

get(rpath, lpath, recursive=False, callback=<fsspec.callbacks.NoOpCallback object>, **kwargs)

Copy file(s) to local.

Copies a specific file or tree of files (if recursive=True). If lpath ends with a “/”, it will be assumed to be a directory, and target files will go within. Can submit a list of paths, which may be glob-patterns and will be expanded.

Calls get_file for each source.

get_file(rpath, lpath, callback=<fsspec.callbacks.NoOpCallback object>, outfile=None, **kwargs)

Copy single remote file to local

get_mapper(root='', check=False, create=False, missing_exceptions=None)

Create key/value store based on this file-system

Makes a MutableMapping interface to the FS at the given root path. See fsspec.mapping.FSMap for further details.

glob(path, **kwargs)

Find files by glob-matching.

If the path ends with ‘/’ and does not contain “*”, it is essentially the same as ls(path), returning only files.

We support "**", "?" and "[..]". We do not support ^ for pattern negation.

Search path names that contain embedded characters special to this implementation of glob may not produce expected results; e.g., ‘foo/bar/*starredfilename*’.

kwargs are passed to ls.

head(path, size=1024)

Get the first size bytes from file

info(path, **kwargs)

Get metadata about a file from a head or list call.

Parameters
  • path (str) – URI of the directory/file

  • kwargs (dict) – additional args for OCI

invalidate_cache(path=None)

Deletes the filesystem cache.

Parameters

path (str (optional)) – The directory from which to clear. If not specified, deletes the entire cache.

isdir(path)

Is this entry directory-like?

isfile(path)

Is this entry file-like?

lexists(path, **kwargs)

If there is a file at the given path (including broken links)

listdir(path, detail=True, **kwargs)

Alias of AbstractFileSystem.ls.

ls(path: str, detail: bool = False, refresh: bool = False, **kwargs)

List single “directory” with or without details

Parameters
  • path (string/bytes) – location at which to list files

  • detail (bool (=False)) – if True, each list item is a dict of file properties; otherwise, returns list of filenames

  • refresh (bool (=False)) – if False, look in local cache for file details first

  • kwargs (dict) – additional arguments passed on

makedir(path, create_parents=True, **kwargs)

Alias of AbstractFileSystem.mkdir.

makedirs(path, exist_ok=False)

Recursively make directories

Creates directory at path and any intervening required directories. Raises exception if, for instance, the path already exists but is a file.

Parameters
  • path (str) – leaf directory name

  • exist_ok (bool (False)) – If False, will error if the target already exists

metadata(path, **kwargs)

Get metadata about a file from a head or list call.

Parameters
  • path (str) – URI of the directory/file

  • kwargs (dict) – additional args for OCI

mirror_sync_methods = False
mkdir(path: str, create_parents: bool = True, compartment_id: Optional[str] = None, **kwargs)

Make a new bucket or folder

Parameters
  • path (str) – URI of the directory

  • create_parents (bool (=True)) – If True, will create all nested dirs

  • compartment_id (str) – The compartment in which to create the bucket, if it is different from the compartment of your auth mechanism.

  • kwargs (dict) – additional args for OCI

mkdirs(path, exist_ok=False)

Alias of AbstractFileSystem.makedirs.

modified(path)

Return the modified timestamp of a file as a datetime.datetime

move(path1, path2, **kwargs)

Alias of AbstractFileSystem.mv.

mv(path1, path2, recursive=False, maxdepth=None, **kwargs)

Move file(s) from one location to another

open(path: str, mode: str = 'rb', block_size: Optional[int] = None, cache_options: Optional[dict] = None, compression: Optional[str] = None, cache_type: Optional[str] = None, autocommit: bool = True, **kwargs)

Open a file for reading or writing

Parameters
  • path (string) – Path of file on oci

  • mode (string) – One of ‘r’, ‘w’, ‘rb’, or ‘wb’. These have the same meaning as they do for the built-in open function.

  • block_size (int) – Size of data-node blocks if reading

  • cache_options (dict, optional) – Extra arguments to pass through to the cache.

  • compression (string or None) – If given, open file using compression codec. Can either be a compression name (a key in fsspec.compression.compr) or “infer” to guess the compression from the filename suffix.

  • cache_type (str) – Caching policy in read mode. Valid types are: {“readahead”, “none”, “mmap”, “bytes”}, default “readahead”

  • autocommit (bool) – If True, the OCIFile will automatically commit the multipart upload when done

  • encoding (str) – The encoding to use if opening the file in text mode. The platform’s default text encoding is used if not given.

  • kwargs (dict-like) – Additional parameters used for oci methods. Typically used for ServerSideEncryption.

pipe(path, value=None, **kwargs)

Put value into path

(counterpart to cat)

Parameters
  • path (string or dict(str, bytes)) – If a string, a single remote location to put value bytes; if a dict, a mapping of {path: bytesvalue}.

  • value (bytes, optional) – If using a single path, these are the bytes to put there. Ignored if path is a dict
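
The two accepted forms of path (a single string with a value, or a dict of path to bytes) can be illustrated against an in-memory dict. This is a sketch of the documented calling convention only; the `pipe` function and `store` below are hypothetical stand-ins, not the filesystem method itself.

```python
def pipe(store, path, value=None):
    """Write bytes into store; path is a str (with value) or a {path: bytes} dict."""
    if isinstance(path, str):
        path = {path: value}
    # When path is a dict, the separate value argument is ignored.
    for target, payload in path.items():
        store[target] = payload


store = {}
pipe(store, "bucket@ns/a.bin", b"one")
pipe(store, {"bucket@ns/b.bin": b"two", "bucket@ns/c.bin": b"three"}, value=b"ignored")
```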

pipe_file(path, value, **kwargs)

Set the bytes of given file

protocol = ['oci']
put(lpath, rpath, recursive=False, callback=<fsspec.callbacks.NoOpCallback object>, **kwargs)

Copy file(s) from local.

Copies a specific file or tree of files (if recursive=True). If rpath ends with a “/”, it will be assumed to be a directory, and target files will go within.

Calls put_file for each source.

put_file(lpath, rpath, callback=<fsspec.callbacks.NoOpCallback object>, **kwargs)

Copy single file to remote

read_block(fn, offset, length, delimiter=None)

Read a block of bytes from a file

Starting at offset of the file, read length bytes. If delimiter is set then we ensure that the read starts and stops at delimiter boundaries that follow the locations offset and offset + length. If offset is zero then we start at zero. The bytestring returned WILL include the end delimiter string.

If offset+length is beyond the eof, reads to eof.

Parameters
  • fn (string) – Path to filename

  • offset (int) – Byte offset to start read

  • length (int) – Number of bytes to read

  • delimiter (bytes (optional)) – Ensure reading starts and stops at delimiter bytestring

Examples

>>> fs.read_block('data/file.csv', 0, 13)  
b'Alice, 100\nBo'
>>> fs.read_block('data/file.csv', 0, 13, delimiter=b'\n')  
b'Alice, 100\nBob, 200\n'

Use length=None to read to the end of the file.

>>> fs.read_block('data/file.csv', 0, None, delimiter=b'\n')
b'Alice, 100\nBob, 200\nCharlie, 300'

See also

fsspec.utils.read_block()

read_bytes(path, start=None, end=None, **kwargs)

Alias of AbstractFileSystem.cat_file.

read_text(path, encoding=None, errors=None, newline=None, **kwargs)

Get the contents of the file as a string.

Parameters
  • path (str) – URL of file on this filesystem

  • encoding, errors, newline – same as for open()

read_timeout = 15
rename(path1, path2, **kwargs)

Alias of AbstractFileSystem.mv.

rm(path, recursive=False, **kwargs)

Remove keys and/or bucket.

Parameters
  • path (str) – The location to remove.

  • recursive (bool (False)) – Whether to also remove all entries below, i.e., those returned by walk().

rm_file(path)

Delete a file

rmdir(path, **kwargs)

Remove a directory, if empty

root_marker = ''
sep = '/'
sign(path, expiration=100, **kwargs)

Create a signed URL representing the given path

Some implementations allow temporary URLs to be generated, as a way of delegating credentials.

Parameters
  • path (str) – The path on the filesystem

  • expiration (int) – Number of seconds to enable the URL for (if supported)

Returns

URL – The signed URL

Return type

str

Raises

NotImplementedError – if method is not implemented for a filesystem

size(path)

Size in bytes of file

sizes(paths)

Size in bytes of each file in a list of paths

split_path(path, **kwargs)

Normalise OCI path string into bucket and key.

Parameters

path (string) – Input path, like oci://mybucket@mynamespace/path/to/file

Examples

>>> split_path("oci://mybucket@mynamespace/path/to/file")
['mybucket', 'mynamespace', 'path/to/file']
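
The path layout in the example above can be parsed with plain string operations. The sketch below only illustrates the oci://bucket@namespace/key shape; `split_oci_path` is a hypothetical name, and the real method may treat defaults and edge cases differently.

```python
def split_oci_path(path):
    """Split oci://bucket@namespace/key into [bucket, namespace, key]."""
    stripped = path[len("oci://"):] if path.startswith("oci://") else path
    location, _, key = stripped.partition("/")
    bucket, _, namespace = location.partition("@")
    return [bucket, namespace, key]


parts = split_oci_path("oci://mybucket@mynamespace/path/to/file")
bucket_only = split_oci_path("oci://mybucket@mynamespace")
```
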
start_transaction()

Begin write transaction for deferring files, non-context version

stat(path, **kwargs)

Alias of AbstractFileSystem.info.

tail(path, size=1024)

Get the last size bytes from file

to_json()

JSON representation of this filesystem instance

Returns

JSON structure with keys cls (the python location of this class), protocol (text name of this class’s protocol, first one in case of multiple), args (positional args, usually empty), and all other kwargs as their own keys.

Return type

str

touch(path: str, truncate: bool = True, data=None, **kwargs)

Create empty file or truncate

Parameters
  • path (string/bytes) – location of the file to create or truncate

  • truncate (bool (=True)) – if True, delete the existing file, replace with empty file

  • data (bool) – if provided, writes this content to the file

  • kwargs (dict) – additional arguments passed on

property transaction

A context within which files are committed together upon exit

Requires the file class to implement .commit() and .discard() for the normal and exception cases.

ukey(path)

Hash of file properties, to tell if it has changed

unstrip_protocol(name)

Format FS-specific path to generic, including protocol

upload(lpath, rpath, recursive=False, **kwargs)

Alias of AbstractFileSystem.put.

walk(path, maxdepth=None, **kwargs)

Return all files below path

List all files, recursing into subdirectories; output is iterator-style, like os.walk(). For a simple list of files, find() is available.

Note that the “files” outputted will include anything that is not a directory, such as links.

Parameters
  • path (str) – Root to recurse into

  • maxdepth (int) – Maximum recursion depth. None means limitless, but not recommended on link-based file-systems.

  • topdown (bool (True)) – Whether to walk the directory tree from the top downwards or from the bottom upwards.

  • kwargs – passed to ls
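
The os.walk()-style output can be pictured on a flat list of object keys, which is how an object store holds "directories". The generator below is a hypothetical sketch of that shape (top-down only, no maxdepth handling) and is not the library's implementation.

```python
def walk(keys, root=""):
    """Yield (path, dirs, files) like os.walk for a flat list of '/'-separated keys."""
    prefix = root.rstrip("/") + "/" if root else ""
    dirs, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        head, sep, _rest = key[len(prefix):].partition("/")
        if sep:
            dirs.add(head)       # something lives below this name
        elif head:
            files.append(head)   # a key directly under root
    dirs = sorted(dirs)
    yield root.rstrip("/"), dirs, sorted(files)
    for name in dirs:
        yield from walk(keys, prefix + name)


keys = ["b/a.txt", "b/sub/c.txt", "b/sub/d.txt"]
tree = list(walk(keys, "b"))
```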

write_bytes(path, value, **kwargs)

Alias of AbstractFileSystem.pipe_file.

write_text(path, value, encoding=None, errors=None, newline=None, **kwargs)

Write the text to the given file.

An existing file will be overwritten.

Parameters
  • path (str) – URL of file on this filesystem

  • value (str) – Text to write.

  • encoding, errors, newline – same as for open()

OCIFile

class ocifs.core.OCIFile(fs: OCIFileSystem, path: str, mode: str = 'rb', block_size: int = 5242880, autocommit: bool = True, cache_type: str = 'bytes', cache_options: Optional[dict] = None, additional_kwargs: Optional[dict] = None, size: Optional[int] = None, **kwargs)

Bases: AbstractBufferedFile

Open OCI URI as a file.

This imitates the native python file object. Data is only loaded and cached on demand.

Parameters
  • fs (OCIFileSystem) – instance of FileSystem

  • path (str) – location in file-system

  • mode (str) – Normal file modes. Currently only ‘w’, ‘wb’, ‘r’ or ‘rb’.

  • block_size (int) – Buffer size for reading or writing, ‘default’ for class default

  • autocommit (bool) – Whether to write to final destination; may only impact what happens when file is being closed.

  • cache_type ({"readahead", "none", "mmap", "bytes"}, default "readahead") – Caching policy in read mode. See the definitions in core.

  • cache_options (dict) – Additional options passed to the constructor for the cache specified by cache_type.

  • size (int) – If given and in read mode, suppresses having to look up the file size

  • kwargs – Gets stored as self.kwargs

DEFAULT_BLOCK_SIZE = 5242880
MAXIMUM_BLOCK_SIZE = 5368709120
MINIMUM_BLOCK_SIZE = 5242880
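
The three block size constants above are 5 MiB, 5 GiB and 5 MiB respectively. The arithmetic below checks that and estimates how many blocks a file of a given size needs at the default block size. `block_count` is a hypothetical helper, and whether each block corresponds to exactly one upload part is an assumption.

```python
DEFAULT_BLOCK_SIZE = 5242880      # value listed above
MAXIMUM_BLOCK_SIZE = 5368709120   # value listed above
MINIMUM_BLOCK_SIZE = 5242880      # value listed above

MIB = 1024 ** 2
GIB = 1024 ** 3


def block_count(file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Number of blocks needed to cover file_size bytes (the last may be short)."""
    return -(-file_size // block_size)  # integer ceiling division


blocks_for_1_gib = block_count(1 * GIB)
```
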
close()

Close file

Finalizes writes, discards cache

property closed
commit(**kwargs)

Move from temp to final destination

property details
discard()

Throw away temporary file

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

flush(force=False)

Write buffered data to backend store.

Writes the current buffer, if it is larger than the block-size, or if the file is being closed.

Parameters

force (bool) – When closing, write the last block even if it is smaller than blocks are allowed to be. Disallows further writing to this file.

property full_name
info()

File information about this path

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

read(length=-1)

Return data from cache, or fetch pieces as necessary

Parameters

length (int (-1)) – Number of bytes to read; if <0, all remaining bytes.

readable()

Whether opened for reading

readinto(b)

Mirrors the builtin file’s readinto method

https://docs.python.org/3/library/io.html#io.RawIOBase.readinto

readinto1(b)
readline()

Read until first occurrence of newline character

Note that, because of character encoding, this is not necessarily a true line ending.

readlines()

Return all data, split by the newline character

readuntil(char=b'\n', blocks=None)

Return data between current position and first occurrence of char

char is included in the output, except if the end of the file is encountered first.

Parameters
  • char (bytes) – Thing to find

  • blocks (None or int) – How much to read in each go. Defaults to file blocksize - which may mean a new read on every call.
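
The readuntil behaviour (char is included unless the end of the file comes first) can be sketched on top of any seekable binary stream. The standalone function below uses io.BytesIO as a stand-in for an open file and a small hypothetical default for blocks. It assumes the delimiter does not straddle two reads, which always holds for a single-byte char.

```python
import io


def readuntil(f, char=b"\n", blocks=4):
    """Read from f up to and including the first occurrence of char."""
    out = []
    while True:
        start = f.tell()
        part = f.read(blocks)
        if not part:
            break                      # end of file reached before char
        found = part.find(char)
        if found >= 0:
            out.append(part[:found + len(char)])
            f.seek(start + found + len(char))   # leave position just after char
            break
        out.append(part)
    return b"".join(out)


f = io.BytesIO(b"ab\ncd")
first = readuntil(f)
position = f.tell()
second = readuntil(f)
```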

retries = 5
seek(loc, whence=0)

Set current file location

Parameters
  • loc (int) – byte location

  • whence ({0, 1, 2}) – from start of file, current location or end of file, resp.
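
The whence values follow the same convention as Python's standard binary streams, so io.BytesIO can serve as a stand-in to show the three modes. This sketch does not touch OCI; the data is made up for illustration.

```python
import io

f = io.BytesIO(b"0123456789")

f.seek(2, 0)             # whence=0: 2 bytes from the start of the file
pos_from_start = f.tell()

f.seek(3, 1)             # whence=1: 3 bytes forward from the current location
pos_from_current = f.tell()

f.seek(-1, 2)            # whence=2: 1 byte before the end of the file
pos_from_end = f.tell()
last_byte = f.read()
```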

seekable()

Whether the file is seekable (only in read mode)

tell()

Current file location

truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

writable()

Whether opened for writing

write(data)

Write data to buffer.

Buffer only sent on flush() or if buffer is greater than or equal to blocksize.

Parameters

data (bytes) – Set of bytes to be written.
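
The buffering rule described for write() and flush() can be sketched with a small in-memory class. `BufferedSink` is a hypothetical stand-in that records what would be sent to the backend. It sends the whole buffer at once rather than splitting it into block-sized pieces, which is a simplification.

```python
class BufferedSink:
    """Collects writes and only 'sends' once the buffer reaches blocksize."""

    def __init__(self, blocksize):
        self.blocksize = blocksize
        self.buffer = bytearray()
        self.sent = []  # payloads handed to the pretend backend

    def write(self, data):
        self.buffer += data
        if len(self.buffer) >= self.blocksize:
            self.flush()
        return len(data)

    def flush(self, force=False):
        # Send when the buffer is large enough, or when forced (e.g. on close).
        if self.buffer and (force or len(self.buffer) >= self.blocksize):
            self.sent.append(bytes(self.buffer))
            self.buffer.clear()


sink = BufferedSink(blocksize=4)
sink.write(b"ab")          # below blocksize: stays buffered
after_small = list(sink.sent)
sink.write(b"cd")          # buffer reaches blocksize: sent
after_full = list(sink.sent)
sink.write(b"e")
sink.flush()               # still too small, nothing sent
sink.flush(force=True)     # forced, as when closing
```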

writelines(lines, /)

Write a list of lines to stream.

Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.