OpenZGY/Python API and Internals (ALPHA)
Access seismic data stored in ZGY format.
openzgy.impl.compress.CompressPlugin Class Reference
Inherited by openzgy.impl.zfp_compress.ZfpCompressPlugin.

Public Member Functions

def __init__ (self, *args, **kwargs)
 
def __call__ (self, data)
 
def dump (*args, **kwargs)
 

Static Public Member Functions

def compress (data, *args, **kwargs)
 
def decompress (cdata, status, shape, file_dtype, user_dtype)
 

Detailed Description

Base class for OpenZGY compression plug-ins.
If anybody wants to add additional compression algorithms it
is recommended but not required to use this base class. See
CompressFactoryImpl.register{Compressor,Decompressor} for how to
use plain functors (C++) or callables (Python) instead.

This class performs triple duty: it provides both the compression
and decompression static methods (which need not have been together),
and an instance of the class can be used as a compressor functor
if a lambda is too limiting. To invoke the methods:

    MyCompressPlugin.factory(...)(data)
    MyCompressPlugin.compress(data, ...) (NOT recommended)
    MyCompressPlugin.decompress(cdata,status,shape,file_dtype,user_dtype)

The following will also work but should only be used for very simple
compressors that have no parameters. In the first case MyCompressPlugin
won't have the option to return None for certain parameters, and in the
second case handling a variable argument list becomes trickier.

To register this class:
  CompressFactoryImpl.registerCompressor("My",MyCompressPlugin.factory)
  CompressFactoryImpl.registerDecompressor("My",MyCompressPlugin.decompress)

To use the compression part from client code:
   compressor = ZgyCompressFactory("My", ...)

Constructor & Destructor Documentation

◆ __init__()

def openzgy.impl.compress.CompressPlugin.__init__(self, *args, **kwargs)
Create an instance that remembers the arguments it was created with.
When the instance is called as a function it will invoke compress()
with those arguments. So you can use either of the following:

    compressor = CompressPlugin.compress # no arguments
    compressor = CompressPlugin(...)
    compressor = lambda x: CompressPlugin.compress(x, ...)

Derived classes don't need to redefine __init__ and __call__.
But they might want to in order to get argument checking.
The __init__ in the base class accepts any arguments so an error
won't be caught until the first time the compressor is invoked.
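
The late error described above can be illustrated with a stand-in base
class. PluginSketch only mimics the behaviour this page describes and is
not the OpenZGY implementation; DeflatePlugin, CheckedDeflatePlugin and the
"level" parameter are hypothetical.

```python
import zlib


class PluginSketch:
    """Stand-in for the base class: remembers arguments, forwards on call."""

    def __init__(self, *args, **kwargs):
        self._args = args
        self._kwargs = kwargs

    def __call__(self, data):
        return self.compress(data, *self._args, **self._kwargs)

    @staticmethod
    def compress(data, *args, **kwargs):
        raise NotImplementedError("abstract")


class DeflatePlugin(PluginSketch):
    @staticmethod
    def compress(data, level=6):
        return zlib.compress(bytes(data), level)


class CheckedDeflatePlugin(DeflatePlugin):
    def __init__(self, level=6):
        # Redefining __init__ moves the argument check to construction time.
        if not 0 <= level <= 9:
            raise ValueError("level must be in 0..9")
        super().__init__(level)


# The misspelled keyword is accepted here because the base __init__
# takes anything ...
typo = DeflatePlugin(levle=3)
try:
    typo(b"abc")
    late_error = None
except TypeError as exc:
    late_error = exc  # ... and only fails on the first invocation.

# With an explicit __init__ the same mistake fails immediately.
try:
    CheckedDeflatePlugin(levle=3)
    early_error = None
except TypeError as exc:
    early_error = exc

ok = CheckedDeflatePlugin(level=9)(b"abc" * 100)
```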

Member Function Documentation

◆ __call__()

def openzgy.impl.compress.CompressPlugin.__call__(self, data)
Invoke the compressor with arguments passed by the constructor.

Reimplemented in openzgy.impl.zfp_compress.ZfpCompressPlugin.

◆ compress()

def openzgy.impl.compress.CompressPlugin.compress(data, *args, **kwargs)
static
This is an abstract method.

Compress a 3d (or, TODO-Low, 2d) numpy array, returning a bytes-like
result. If called with a single "data" argument the compression
will be done with default parameters and no extended logging.

Additional arguments are specific to the compression type.

The function can be used directly as the compression hook.
But you probably want a lambda expression or a real instance
of this class instead, to be able to specify parameters.

The compression algorithm used is assumed to handle big / little
endian conversion itself. TODO-Worry: this is not quite true for ZFP.
See the documentation. A special compilation flag is needed
on big endian machines. Also I suspect the optional header
(which this code uses) might need byte swapping.
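
A minimal sketch of a compress() hook, under stated assumptions: zlib
stands in for a real algorithm, the MAGIC value and the "level" parameter
are hypothetical, and returning None is assumed to be how the compressor
tells the caller to store the brick uncompressed. Any object supporting
the buffer protocol is accepted, which includes numpy arrays.

```python
import zlib

MAGIC = b"DFL1"  # hypothetical magic number for this sketch


def compress(data, level=6):
    raw = memoryview(data).tobytes()
    packed = MAGIC + zlib.compress(raw, level)
    # Give up when compression does not actually save space.
    if len(packed) >= len(raw):
        return None
    return packed


# Used directly as the hook: default parameters only.
default_hook = compress

# Wrapped in a lambda to be able to specify parameters.
tuned_hook = lambda data: compress(data, level=9)

smooth = bytes(4096)   # highly compressible input
tiny = b"abc"          # too small to benefit from compression
```

Here default_hook(smooth) returns a short bytes object starting with MAGIC,
while default_hook(tiny) returns None because the magic number and the zlib
framing alone already exceed three bytes.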

◆ decompress()

def openzgy.impl.compress.CompressPlugin.decompress(cdata, status, shape, file_dtype, user_dtype)
static
This is an abstract method.

Decompress bytes or similar into a numpy.ndarray.

Arguments:
  cdata      -- bytes or bytes-like compressed data,
        possibly with trailing garbage.
  status     -- Currently always BrickStatus.Compressed,
        in the future the status might be used to
        distinguish between different compression
        algorithms instead of relying on magic numbers.
  shape      -- Rank and size of the result in case this is
        not encoded by the compression algorithm.
  file_dtype -- Original value type before compression,
        in case the decompressor cannot figure it out.
        This will exactly match the dtype of the
        data buffer passed to the compressor.
  user_dtype -- Required value type of returned array.

Passing an uncompressed brick to this function is an error.
We don't have enough context to handle uncompressed bricks
that might require byteswapping and fixes for legacy quirks.
This function also cannot handle constant bricks, missing bricks, etc.

The reason user_dtype is needed is to avoid additional
quantization noise when the user requests integer compressed data
to be read as float. Without it, the decompressor might need to convert
float data to int, only to have it converted back to float later.
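
The extra noise can be shown with plain arithmetic. The values below are
invented, and the linear SCALE / OFFSET mapping from integer storage values
to user floats is a hypothetical stand-in for whatever conversion the file
actually specifies.

```python
# Floats as they might come out of a lossy float codec, in storage units.
decoded = [12.4, -3.6, 100.49]

# Hypothetical linear mapping from storage values to user floats.
SCALE, OFFSET = 0.5, 10.0


def to_user_float(storage_value):
    return storage_value * SCALE + OFFSET


# user_dtype is float: the decoded values are converted directly.
direct = [to_user_float(v) for v in decoded]

# Decompressor forced to return the integer file_dtype: values are rounded
# first, and the caller converts the integers to float afterwards.
via_int = [to_user_float(round(v)) for v in decoded]

# Noise added purely by the detour through integers.
extra_noise = [abs(a - b) for a, b in zip(direct, via_int)]
```

The detour can add up to half a storage unit of error per sample, here
0.5 * SCALE, on top of whatever loss the codec itself introduced.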

Current assumptions made of all candidate algorithms:

    -  The compressed data stream may have trailing garbage;
       this will be silently ignored by the decompressor.

    -  The compressed data stream will never be longer than
       the uncompressed data. This needs to be enforced by
       the compressor. The compressor is allowed to give up
       and tell the caller to not compress this brick.

    -  The reason for the two assumptions above is an
       implementation detail; the reported size of a
       compressed brick is not completely reliable.
       This might change in the next version.

    -  The compressed data stream must start with a magic
       number so the decompressor can figure out whether
       this is the correct algorithm to use.
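
A decompressor meeting the magic number and trailing garbage assumptions
might look like this sketch. zlib is used as a stand-in because its stream
is self-terminating; MAGIC is hypothetical, the signature is simplified to
a single argument, and returning None for a foreign magic number is an
assumption about how the caller would try the next algorithm.

```python
import zlib

MAGIC = b"DFL1"  # hypothetical magic number for this sketch


def decompress(cdata):
    buf = bytes(cdata)
    if not buf.startswith(MAGIC):
        return None  # not our algorithm
    codec = zlib.decompressobj()
    raw = codec.decompress(buf[len(MAGIC):])
    # Bytes after the end of the compressed stream are collected in
    # codec.unused_data and silently ignored here.
    if not codec.eof:
        raise ValueError("compressed stream is truncated")
    return raw


original = b"seismic" * 50
packed = MAGIC + zlib.compress(original)
padded = packed + b"\x00" * 17  # brick with trailing garbage
```

A codec whose stream does not mark its own end could not ignore the
garbage this way and would need an explicit size field instead.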

If the assumptions cannot be met, the compressor / decompressor
for this particular type could be modified to add an extra header
with the compressed size and a magic number. Or we might add a
(size, algorithm number) header to every compressed block to
relieve the specific compressor / decompressor from worrying
about this. Or the brick status could be used to encode which
algorithm was used, picked up from the MSB of the lup entry.
That would also require the compressor to return both the
actual compressed data and the code to identify the decompressor,
which is the main reason we are also passed the "status" arg.
Caveat: If adding an extra header, keep in mind that this header
must be included when checking that the compressed stream is not
too big.
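
The extra header idea could be sketched as below. The layout (4 byte
magic followed by a little-endian 32 bit payload size) is hypothetical,
zlib stands in for the real codec, the decompress() signature is
simplified, and None is assumed to mean "do not compress" on the way in
and "not my algorithm" on the way out.

```python
import struct
import zlib

MAGIC = b"DFL2"                 # hypothetical magic number
HEADER = struct.Struct("<4sI")  # magic, payload size (little-endian)


def compress(data, level=6):
    raw = memoryview(data).tobytes()
    payload = zlib.compress(raw, level)
    packed = HEADER.pack(MAGIC, len(payload)) + payload
    # The header is included when checking that the stream is not too big.
    if len(packed) > len(raw):
        return None
    return packed


def decompress(cdata):
    buf = bytes(cdata)
    if len(buf) < HEADER.size:
        return None
    magic, size = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        return None
    payload = buf[HEADER.size:HEADER.size + size]
    if len(payload) < size:
        raise ValueError("compressed stream is truncated")
    # Anything beyond "size" bytes is trailing garbage and never looked at.
    return zlib.decompress(payload)


raw = bytes(1000)
packed = compress(raw)
```

With the size stored explicitly, the decompressor no longer depends on the
codec detecting the end of its own stream.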

Reimplemented in openzgy.impl.zfp_compress.ZfpCompressPlugin.

◆ dump()

def openzgy.impl.compress.CompressPlugin.dump(*args, **kwargs)
Output statistics to standard output, if possible.
