OpenZGY/C++ API and Internals (ALPHA)
Access seismic data stored in ZGY format.
InternalZGY::DataBuffer Class Reference (abstract)

Each DataBuffer instance represents some in memory data.

#include <databuffer.h>

Inheritance diagram for InternalZGY::DataBuffer:
InternalZGY::DataBufferNd< T, NDim >

Public Types

enum  flags_t { C_ORDERING =1, CONTIGUOUS =2, POSITIVE_STRIDE =4, DEGENERATE =8 }
 

Public Member Functions

 DataBuffer (const DataBuffer &)=delete
 
DataBuffer & operator= (const DataBuffer &)=delete
 
virtual std::string toString () const =0
 
virtual std::shared_ptr< void > voidData ()=0
 
virtual std::shared_ptr< const void > voidData () const =0
 
virtual bool contiguous () const =0
 
virtual std::int64_t allocsize () const =0
 
virtual std::int64_t totalsize () const =0
 
virtual std::int64_t itemsize () const =0
 
virtual const std::int64_t * sizeptr () const =0
 
virtual const std::int64_t * strideptr () const =0
 
virtual std::array< std::int64_t, 3 > size3d () const =0
 
virtual std::array< std::int64_t, 3 > stride3d () const =0
 
virtual bool ownsdata () const =0
 
virtual bool isScalar () const =0
 
virtual bool isAllSame (const std::int64_t *used_in) const =0
 
virtual double scalarAsDouble () const =0
 
virtual RawDataType datatype () const =0
 
virtual void fill (double value)=0
 
virtual void clear ()=0
 
virtual std::pair< double, double > range () const =0
 
virtual void copyFrom (const DataBuffer *src, const std::int64_t *srcorig, const std::int64_t *dstorig, const std::int64_t *cpyorig, const std::int64_t *cpysize)=0
 
virtual std::shared_ptr< DataBuffer > clone () const =0
 
virtual std::shared_ptr< DataBuffer > scaleToFloat (const std::array< double, 2 > &)=0
 
virtual std::shared_ptr< DataBuffer > scaleToStorage (const std::array< double, 2 > &, RawDataType)=0
 
virtual std::uint32_t layout () const =0
 
virtual bool is_cstride () const =0
 
virtual void check_cstride () const =0
 
virtual std::shared_ptr< DataBuffer > slice1 (int dim, std::int64_t start, std::int64_t size) const =0
 

Static Public Member Functions

static std::shared_ptr< DataBuffer > makeDataBuffer3d (void *raw, std::int64_t nbytes, const std::array< std::int64_t, 3 > &size, RawDataType dtype)
 

Detailed Description

Each DataBuffer instance represents some in memory data.

Warning, creeping features. The class started out as a trivial pointer + 3d size struct but is growing dangerously close to emulating a Python numpy.ndarray.

This data type represents an unsafe buffer pointer, plus:

There is no knowledge of where in the survey this data belongs.

There is no support for views covering just part of a buffer, and strides must all be positive. It is possible to support those scenarios but let's not put in bells and whistles before we see they are needed.

DataBuffer is a non-templated base class to allow passing buffers around without needing to have the calling code templated as well.
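The pattern described above can be sketched in miniature. This is an illustration only: AnyBuffer, TypedBuffer, bytesUsed and typedData are hypothetical names chosen for the example, not the OpenZGY classes, and the real DataBuffer interface is much larger.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-templated base: callers only ever see this type.
class AnyBuffer {
public:
  virtual ~AnyBuffer() = default;
  virtual std::int64_t itemsize() const = 0;   // bytes per sample
  virtual std::int64_t totalsize() const = 0;  // number of samples
  virtual void fill(double value) = 0;
};

// Hypothetical templated leaf: the only place that knows the sample type.
template <typename T>
class TypedBuffer : public AnyBuffer {
public:
  explicit TypedBuffer(std::size_t count) : data_(count) {}
  std::int64_t itemsize() const override { return static_cast<std::int64_t>(sizeof(T)); }
  std::int64_t totalsize() const override { return static_cast<std::int64_t>(data_.size()); }
  void fill(double value) override { std::fill(data_.begin(), data_.end(), static_cast<T>(value)); }
  const T* data() const { return data_.data(); }
private:
  std::vector<T> data_;
};

// Calling code is not templated; it works on any leaf type.
inline std::int64_t bytesUsed(const AnyBuffer& b)
{
  assert(b.itemsize() > 0);
  return b.itemsize() * b.totalsize();
}

// Recover the leaf type when needed. Returns nullptr on a type mismatch.
template <typename T>
const T* typedData(const AnyBuffer& b)
{
  const auto* leaf = dynamic_cast<const TypedBuffer<T>*>(&b);
  return leaf ? leaf->data() : nullptr;
}
```

Only code that actually touches samples needs to know T; everything else can pass the base class around.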

copyFrom() is somewhat unsafe. It can verify (using dynamic_cast) that "src" is of the same type as "this". But srcorig etc. are passed as dumb pointers so the code cannot know whether they are of the correct size. Maybe I am being too general here; maybe the number of dimensions should always be 3?

The actual bulk data is stored as a std::shared_ptr<T>, which means it can be passed around independently of the DataBuffer instance. The voidData() method returns the actual smart pointer as a std::shared_ptr<void> which can be downcast to the correct type. The data() method in the templated leaf type returns the raw pointer. It is possible to also have methods returning void* and std::shared_ptr<T> but don't add those unless actually used.
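A type-erased std::shared_ptr<void> can be downcast with std::static_pointer_cast. The sketch below assumes the caller already knows the sample type (for example by checking datatype() first). makeVoidBuffer and sumAsFloat are hypothetical helpers written for this example and are not part of OpenZGY.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for what a voidData()-style method hands out:
// a type-erased shared pointer that still owns a float array.
inline std::shared_ptr<void> makeVoidBuffer(std::size_t count, float value)
{
  // The array deleter is stored in the control block, so it survives
  // the conversion to shared_ptr<void>.
  std::shared_ptr<float> typed(new float[count], std::default_delete<float[]>());
  for (std::size_t i = 0; i < count; ++i)
    typed.get()[i] = value;
  return typed;
}

// Recover the typed pointer. The cast is unchecked: the caller must
// know that the buffer really holds float samples.
inline float sumAsFloat(const std::shared_ptr<void>& erased, std::size_t count)
{
  assert(erased != nullptr);
  std::shared_ptr<float> typed = std::static_pointer_cast<float>(erased);
  float sum = 0;
  for (std::size_t i = 0; i < count; ++i)
    sum += typed.get()[i];
  return sum;
}
```

The typed pointer shares ownership with the erased one, so the data stays alive as long as either copy exists.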

Constructors that expect an external buffer are required to provide it as a smart pointer. The DataBuffer will share ownership with the caller. There is nothing that prevents the callers from passing a shared_ptr with a no-op deleter, effectively removing reference counting. But that will of course void their warranty and make the callers 100% responsible for keeping the data valid long enough.

Unsafe external buffers are currently used for read() and write() requests from the application code and for delivery from the file back end. Reference counted external buffers are used when the buffer was built from decompressed data.

Each of those scenarios should be ok since no buffers should be held onto after the functions return.

TODO-Worry: Could async read requests get delivered after the call to read() got aborted via an exception?

TODO-Worry: Might somebody decide to hold on to a buffer in order to implement delayed write? In 99% of cases the application's data buffer needs to be copied into a properly reference counted buffer anyway. Maybe make that 100% by not allowing shortcuts when the application writes exactly one brick. Another problem is if the writer decides to copy out one brick at a time into a reusable one-brick buffer. This could also lead to the user's buffer being held on to longer. Maybe make sure all bricks are copied up front before any writing starts. This might also simplify parallelized compression.

As an alternative to the above, I have considered a scheme that makes less use of smart pointers so I can reduce the overhead they cause. Beware of premature optimization though, and of more things that could go wrong with buffers freed early. The changes would be:

Other changes I have considered, orthogonal to the above:

TODO-Low decide on support for strided data.

It is easy enough for this class to allow arbitrary strides. Even strides that don't make sense. The problem is that code accessing the data would like to make assumptions about the stride so the code becomes simpler and also simpler to test. The current situation is unclear. Some code is flexible but it might not be possible to use that flexibility due to other code making assumptions.

The more support there is for handling strides, the more opportunities exist to use views on a buffer instead of copying them.
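To make the stride discussion concrete, the sketch below computes the strides of a densely packed C-ordered 3d buffer and the offset of a sample for arbitrary strides. It assumes strides are counted in samples rather than bytes, which this page does not state. Index3, cStrides, isDenseCOrder and offsetOf are hypothetical names, and the sketch makes no claim about how the flags_t values are actually computed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Index3 = std::array<std::int64_t, 3>;

// Strides, counted in samples, for a densely packed buffer where the
// last dimension varies fastest (C ordering).
inline Index3 cStrides(const Index3& size)
{
  assert(size[1] >= 0 && size[2] >= 0);
  return Index3{{size[1] * size[2], size[2], 1}};
}

// True if the strides are exactly the dense C-order strides for this size.
// Code that assumes this layout can use simple pointer arithmetic.
inline bool isDenseCOrder(const Index3& size, const Index3& stride)
{
  return stride == cStrides(size);
}

// Offset in samples of element pos for arbitrary strides. Code written
// against this function works for views as well as dense buffers.
inline std::int64_t offsetOf(const Index3& stride, const Index3& pos)
{
  return pos[0] * stride[0] + pos[1] * stride[1] + pos[2] * stride[2];
}
```

Code that goes through offsetOf stays general, while code that first checks isDenseCOrder can take the simpler and more easily tested path.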

Member Function Documentation

◆ clone()

virtual std::shared_ptr<DataBuffer> InternalZGY::DataBuffer::clone ( ) const
pure virtual

Deep copy of a DataBuffer

Implemented in InternalZGY::DataBufferNd< T, NDim >.

◆ copyFrom()

virtual void InternalZGY::DataBuffer::copyFrom ( const DataBuffer *  src,
const std::int64_t *  srcorig,
const std::int64_t *  dstorig,
const std::int64_t *  cpyorig,
const std::int64_t *  cpysize 
)
pure virtual

Corresponds to openzgy.impl._partialCopy(). To make templates work better the C++ version is an instance method with 'this' as destination.

Implemented in InternalZGY::DataBufferNd< T, NDim >.
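The argument semantics are not spelled out here. The sketch below assumes that srcorig and dstorig give the position of each buffer in a shared coordinate system, and that cpyorig and cpysize describe the requested region in that same system, so that only the part lying inside both buffers is copied. That interpretation is an assumption, and Block and partialCopy are hypothetical names for dense C-ordered float data only.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Index3 = std::array<std::int64_t, 3>;

// Hypothetical dense C-ordered 3d block positioned at `orig` in a
// coordinate system shared by source and destination.
struct Block {
  Index3 orig;
  Index3 size;
  std::vector<float> data;
  Block(const Index3& o, const Index3& s, float fillvalue)
    : orig(o), size(s),
      data(static_cast<std::size_t>(s[0] * s[1] * s[2]), fillvalue) {}
  // Flat index of absolute position (i, j, k); must lie inside the block.
  std::size_t index(std::int64_t i, std::int64_t j, std::int64_t k) const {
    assert(i >= orig[0] && j >= orig[1] && k >= orig[2]);
    return static_cast<std::size_t>(
      ((i - orig[0]) * size[1] + (j - orig[1])) * size[2] + (k - orig[2]));
  }
};

// Copy the part of the requested region that lies inside both blocks.
// Returns the number of samples copied.
inline std::int64_t partialCopy(const Block& src, Block& dst,
                                const Index3& cpyorig, const Index3& cpysize)
{
  Index3 beg, end;
  for (int d = 0; d < 3; ++d) {
    beg[d] = std::max({src.orig[d], dst.orig[d], cpyorig[d]});
    end[d] = std::min({src.orig[d] + src.size[d],
                       dst.orig[d] + dst.size[d],
                       cpyorig[d] + cpysize[d]});
    if (end[d] <= beg[d])
      return 0; // No overlap in this dimension, nothing to copy.
  }
  for (std::int64_t i = beg[0]; i < end[0]; ++i)
    for (std::int64_t j = beg[1]; j < end[1]; ++j)
      for (std::int64_t k = beg[2]; k < end[2]; ++k)
        dst.data[dst.index(i, j, k)] = src.data[src.index(i, j, k)];
  return (end[0] - beg[0]) * (end[1] - beg[1]) * (end[2] - beg[2]);
}
```

Because the origins arrive as plain pointers in the real interface, a check like the clipping above cannot verify that the caller passed arrays of the right length.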

◆ makeDataBuffer3d()

std::shared_ptr< DataBuffer > InternalZGY::DataBuffer::makeDataBuffer3d ( void *  raw,
std::int64_t  nbytes,
const std::array< std::int64_t, 3 > &  size,
RawDataType  dtype 
)
static

Convert voidptr + nbytes + size3d to a DataBuffer.

To avoid too much copy/paste, this method is able to create several different buffer types by calling different DataBufferNd constructors.

If voidptr is null this creates an uninitialized 3d DataBuffer with the given size and type and isScalar() == false. nbytes should be 0.

If voidptr is not null then this creates a scalar DataBuffer if nbytes indicates there is room for just one sample. If there are enough bytes it creates a regular DataBuffer pointing to the voidptr that was passed in.

The valid values for nbytes are:

0 - If and only if voidptr is null.
sizeof(T) - (T*)voidptr contains the scalar to use.
sizeof(double) - (double*)voidptr contains the scalar to use.
size[0]*size[1]*size[2]*sizeof(T) - (T*)voidptr is used as-is.

Any other value will raise an exception.
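The rules above can be written as a small classification function. This is a sketch, not the actual implementation: BufferKind and classifyBuffer are hypothetical names, itemsize stands in for sizeof(T) of the requested dtype, the exception type is a guess, and the order in which the cases are tested is an assumption that matters only in the ambiguous situations.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical labels for the four accepted cases.
enum class BufferKind { Uninitialized, ScalarT, ScalarDouble, Regular };

// Classify the arguments according to the list of valid nbytes values.
inline BufferKind classifyBuffer(const void* raw, std::int64_t nbytes,
                                 const std::array<std::int64_t, 3>& size,
                                 std::int64_t itemsize)
{
  assert(itemsize > 0);
  if (raw == nullptr) {
    // A null pointer is only valid together with nbytes == 0.
    if (nbytes != 0)
      throw std::invalid_argument("null buffer requires nbytes == 0");
    return BufferKind::Uninitialized;
  }
  if (nbytes == itemsize)
    return BufferKind::ScalarT;       // one sample of type T
  if (nbytes == static_cast<std::int64_t>(sizeof(double)))
    return BufferKind::ScalarDouble;  // one double holding the scalar
  if (nbytes == size[0] * size[1] * size[2] * itemsize)
    return BufferKind::Regular;       // full buffer, used as-is
  throw std::invalid_argument("unexpected nbytes for this size and type");
}
```

With this ordering a 1x1x1 buffer is reported as a scalar, and an 8-byte buffer of smaller samples is reported as a double.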

Technically the scheme is ambiguous if sizeof(T) is 8 bytes and size indicates less than 8 bytes. The code cannot know for sure whether the buffer points to a double or to 8/4/2/1 elements of T. This should never happen in practice. If worried, remove the "double" feature or introduce a new is_double flag.

The scheme also cannot distinguish between a scalar of size 1x1x1 and a regular buffer of size 1x1x1. But those two are practically equivalent.

Caveat: If the data pointer is not correctly aligned to the type of data then the returned DataBuffer will also have a misaligned pointer.

Caveat: Since the user can pass in a void* buffer, this will not be reference counted and could go out of scope before the DataBuffer does. This might change. To mitigate this somewhat, if the code knows the buffer is no longer valid it can call databuffer.clear().

◆ scaleToFloat()

virtual std::shared_ptr<DataBuffer> InternalZGY::DataBuffer::scaleToFloat ( const std::array< double, 2 > &  )
pure virtual

Convert DataBufferNd<T,N> to DataBufferNd<float,N>, hiding leaf types.

Implemented in InternalZGY::DataBufferNd< T, NDim >.

◆ scaleToStorage()

virtual std::shared_ptr<DataBuffer> InternalZGY::DataBuffer::scaleToStorage ( const std::array< double, 2 > &  ,
RawDataType   
)
pure virtual

Convert DataBufferNd<float,N> to DataBufferNd<T,N>, hiding leaf types.

Implemented in InternalZGY::DataBufferNd< T, NDim >.


The documentation for this class was generated from the following files: