[
- Ingestion Workflow using Excel DAG
This document guides users who want to leverage the preregistered Excel Directed Acyclic Graph (DAG) and have their data ingested from Excel files. It covers the existing Excel DAG capability, defines a way to trigger the DAG, and describes how to push the input Excel files for ingestion.
The OSDU Data Platform R3 ingestion framework is built to ingest various types of data files, such as Excel, CSV, LAS, and DLIS, using Apache Airflow, an open-source platform for authoring, scheduling, and monitoring workflows. Workflows in Airflow are collections of tasks that have directional dependencies. Specifically, Airflow uses a DAG to represent a workflow. At a high level, a DAG can be thought of as a container that holds tasks and their dependencies, and it sets the context for when and how those tasks should be executed. Various components in the framework enable uploading a file, creating the required schemas for ingestion, or invoking the ingestors. Each is explained in detail in the sections that follow. The ingestion framework comes with some pre-registered DAGs. These are available out of the box, and users can run them to execute specific workflows. Some of these pre-registered DAGs are for Excel, CSV, Shapefile, LAS, and document ingestion workflows.
File service – This service facilitates the management of files on the data platform. The File service provides the uploading, secure discovery, and downloading of files.
Storage service – This JSON object store facilitates the storage of metadata information for domain entities. It also raises storage events when records are saved using the Storage service.
Notification service – Consumers use this service to subscribe to storage events. Use the Register service to register the subscription.
Workflow service – This service facilitates the management of workflows on the data platform by providing a wrapper around the workflow engine to abstract some of the technical nuances of the workflow engine from consumers.
Airflow engine – This service is the heart of the ingestion framework and acts as a workflow orchestrator.
DAGs – These are based on the Directed Acyclic Graph concept and represent workflows that are authored, orchestrated, managed, and monitored by the workflow engine.

Access required:
| # | Service level group | Service API operations |
|---|---|---|
| 1 | service.workflow.viewer | List Custom Operators |
| 2 | service.workflow.creator | Register DAG |
| 3 | service.workflow.admin | Register Custom Operator, Trigger Workflow, Data sharing between two tasks in ONE workflow (SAS URL) |
| 4 | service.file.viewers & service.file.editors | To upload and download files and create file metadata |
| 5 | service.legal.editor & service.legal.user | Create and retrieve legal tags |
| 6 | service.schema-service.editors, service.schema-service.viewers & service.schema-service.admin | To create, update, and view the schema |
| 7 | service.storage.admin, service.storage.creator & service.storage.viewer | To create, update, and view records in storage |
| 8 | service.search.user & service.search.admin | To search records |

Use the Schema service to register the schema that describes the data model for each Excel sheet. Refer to the Schema service tutorial and API Reference.
A signed URL is required in order to upload the Excel file to cloud storage.
GET Request: /file/v2/files/uploadURL
Response:
Sample Response
{ "FileID": "0bc2ad23cecb4275916089280da336ca", "Location": { "SignedURL": "https://osdur3mvpdp1qadizpdata.blob.core.windows.net/file-staging-area/osdu-user%2F1611218353490-2021-01-21-08-39-13-490%2F0bc2ad23cecb4275916089280da336ca?sv=2019-12-12&se=2021-01-28T08%3A39%3A14Z&sr=b&sp=cw&sig=y3%2FVHfeCyotI5bJU9446vHnpbIIchtrsB40szoj0QMM%3D", "FileSource": "/osdu-user/1611218353490-2021-01-21-08-39-13-490/0bc2ad23cecb4275916089280da336ca" } }- We refer to this response in next step. The SignedURL attribute is used to upload the file in the next step (3), and the FileSource is used in step 5 to run workflow.
Upload the file using the signed URL obtained in the previous step. Use a PUT request to pass the input file. The file is uploaded to cloud storage.
- PUT Request: Use the SignedURL value from previous step
- HTTP Headers:
  'x-ms-blob-type: BlockBlob'
  'content-type: text/Excel'
- Execute the PUT request by adding the file as a binary request.
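The upload step can be sketched in Python as follows. This is a minimal illustration, not the platform's client: the function name `build_upload_request` is hypothetical, the header values are copied from the text above, and the request is only constructed here, not sent.

```python
import urllib.request


def build_upload_request(upload_url_response: dict, file_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request for the signed URL.

    `upload_url_response` is the parsed JSON body returned by
    GET /file/v2/files/uploadURL, as shown in the sample response.
    """
    signed_url = upload_url_response["Location"]["SignedURL"]
    return urllib.request.Request(
        signed_url,
        data=file_bytes,  # the Excel file is passed as the binary request body
        method="PUT",
        headers={
            "x-ms-blob-type": "BlockBlob",
            "content-type": "text/Excel",
        },
    )
```

To perform the upload, pass the returned object to `urllib.request.urlopen`. Keep the FileSource value from the same response; it is needed when creating the file metadata.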
Use the File service to create metadata. The metadata enables the discovery of the file and secure downloads. It also provides a mechanism to query information associated with the file that is required during its processing. The file metadata used to trigger the excel DAG workflow must contain information for each individual sheet in the excel file. For more information, refer to the File Metadata Record section.
API reference
POST request: /file/v2/files/metadata
Payload:
Set the metadata data.DatasetProperties.FileSourceInfo.FileSource parameter to the FileSource value obtained from step 2.
Sample request
{ "data": { "Endian": "BIG", "Description": "string", "DatasetProperties": { "FileSourceInfo": { "FileSource": "/osdu-user/1640852899353-2021-12-30-08-28-19-353/723fb105ee84494999e1a3dfaa4004e8", "PreLoadFilePath": "string", "PreloadFileCreateUser": "string", "PreloadFileModifyUser": "string", "PreloadFileModifyDate": "string", "Name": "NZ-Amokura_Sample.xlsx", "FileSize": "string", "EncodingFormatTypeID": "string" } }, "ResourceLifecycleStatus": "string", "TotalSize": "string", "ResourceCurationStatus": "string", "EncodingFormatTypeID": "string", "Source": "string", "Name": "NZ-Amokura.xlsx", "ResourceHomeRegionID": "string", "ResourceHostRegionIDs": [ "string" ], "ExtensionProperties": { "Classification": "string", "Description": "string", "ExternalIds": [ "string" ], "FileDateCreated": {}, "FileDateModified": {}, "FileContentsDetails": { "excelSheetsMetadata": [ { "excelSheetName": "field", "TargetKind": "<authority>:test:field:1.0.0", "nestedFieldDelimiter": ".", "FrameOfReference": [ { "kind": "CRS", "name": "GCS_WGS_1984", "persistableReference": "{\"wkt\":\"GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433],AUTHORITY[\\\"EPSG\\\",4326]]\",\"ver\":\"PE_10_3_1\",\"name\":\"GCS_WGS_1984\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"4326\"},\"type\":\"LBC\"}", "propertyNames": [ "Location Id" ], "propertyValues": [ "deg" ], "uncertainty": 0 }, { "kind": "DateTime", "persistableReference": "{\"type\": \"DAT\", \"format\": \"MM-dd-yyyy\"}", "name": "date", "propertyNames": [ "Discovery Date" ], "propertyValues": [ "Discovery Date" ], "uncertainty": 0 } ], "relationships": [ { "project": { "ids": [ "<authority>:testSource:project-sxs0f1a5219-4640-50af-9f63:", "<authority>:testSource:project-dsv05f45665-5885-40ad-d9m3:" ] } } ], "acl": { "viewers": [ "data.default.viewers@{domain}.com" ], "owners": [ "data.default.viewers@{domain}.com" ] }, 
"legal": { "legaltags": [ "<valid legal tag>" ], "otherRelevantDataCountries": [ "US" ], "status": "compliant" } }, { "excelSheetName": "wellbore", "TargetKind": "<authority>:test:wellbore:1.0.0", "nestedFieldDelimiter": ".", "SpatialMapping": { "type": "point", "latitude": "LATITUDE", "longitude": "LONGUITUDE" }, "FrameOfReference": [ { "kind": "CRS", "name": "GCS_WGS_1984", "persistableReference": "{\"wkt\":\"GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433],AUTHORITY[\\\"EPSG\\\",4326]]\",\"ver\":\"PE_10_3_1\",\"name\":\"GCS_WGS_1984\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"4326\"},\"type\":\"LBC\"}", "propertyNames": [ "LATITUDE", "LONGUITUDE" ], "propertyValues": [ "deg" ], "uncertainty": 0 }, { "kind": "DateTime", "persistableReference": "{\"type\": \"DAT\", \"format\": \"MM-dd-yyyy\"}", "name": "date", "propertyNames": [ "STATUS_DATE", "SPUD_DATE" ], "propertyValues": [ "STATUS_DATE", "SPUD_DATE" ], "uncertainty": 0 }, { "kind": "Unit", "name": "ft", "persistableReference": "{\"scaleOffset\":{\"scale\":0.3048,\"offset\":0.0},\"symbol\":\"ft\",\"baseMeasurement\":{\"ancestry\":\"Length\",\"type\":\"UM\"},\"type\":\"USO\"}", "propertyNames": [ "md", "tvd", "elevation" ], "propertyValues": [ "ft" ], "uncertainty": 0 } ], "relationships": [ { "project": { "ids": [ "<authority>:testSource:project-sxs0f1a5219-4640-50af-9f63:", "<authority>:testSource:project-dsv05f45665-5885-40ad-d9m3:" ] } } ], "relatedNaturalKey": [ { "well": { "targetKind": "<authority>:testSource:well:1.0.0", "keys": [ { "sourceColumn": "UBHI", "targetAttribute": "uwi" } ] } } ] } ] } }, "ResourceSecurityClassification": "string", "ExistenceKind": "string", "SchemaFormatTypeID": "string" }, "meta": [ {} ], "id": "<authority>:dataset--File.Generic:acf7cf1f-e396-4075-a381-bca3117c5eab", "version": 1640855792471113, "kind": "<authority>:wks:dataset--File.Generic:1.0.0", 
"acl": { "viewers": [ "data.default.viewers@{domain}.com" ], "owners": [ "data.default.owners@{domain}.com" ] }, "legal": { "legaltags": [ "<valid legal tag>" ], "otherRelevantDataCountries": [ "US" ], "status": "compliant" } }Sample Response
{ "id": "<authority>:file:4cedbed0-837b-4797-9683-f088a7a15014" }
This is an explicit way to trigger the execution of an ingestion workflow using the Workflow service workflowRun endpoint.
API reference: Workflow service
POST request: /workflow/<workflow_name>/workflowRun
Excel_parser_wf is the default Excel ingestion workflow_name defined by the ingestion framework. See all standard Excel workflow names here. Create custom workflows using the Workflow service.
Payload:
Sample Request Payload
{ "executionContext": { "id": "<file metadata id obtained in step 4 >", "dataPartitionId": "<dataPartition>" }, "runId": "Optional. Run ID can be set explicitly by the user. If not set, the system will generate a run ID for the workflow run." }Sample Response Payload
{ "workflowId": "excel_ingestor_wf", "runId": "06f9c92f-e302-4987-9c69-a62abcc7e0cd", "startTimeStamp": 1617191215379, "status": "submitted", "submittedBy": "<user>@<domain>.com" }Error Codes
| Code | Message | | ---- | --------------------------------------------- | | 400 | Bad Request. Partition ID is missing/invalid. | | 401 | Unknown or Invalid user. | | 403 | User forbidden from accessing this API. | | 404 | Not Found. Workflow Doesn't exist. |
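As a rough sketch of how the trigger call is assembled, the helper below builds the URL and JSON body from the values gathered in the earlier steps. The function name and the `base_url` parameter are hypothetical; the path and payload fields follow the sample request above, and the optional runId is omitted unless the caller supplies one.

```python
from typing import Optional, Tuple


def build_trigger_request(
    base_url: str,
    workflow_name: str,
    file_metadata_id: str,
    data_partition_id: str,
    run_id: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return the (url, payload) pair for POST /workflow/<workflow_name>/workflowRun."""
    url = f"{base_url.rstrip('/')}/workflow/{workflow_name}/workflowRun"
    payload = {
        "executionContext": {
            "id": file_metadata_id,  # ID returned by the file metadata step
            "dataPartitionId": data_partition_id,
        }
    }
    if run_id is not None:
        # Optional: when absent, the system generates a run ID.
        payload["runId"] = run_id
    return url, payload
```

Serialize the payload with `json.dumps` and send it with whichever HTTP client and authentication headers your deployment requires; those details are outside this sketch.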
You can monitor the execution status of the workflow using the Monitor Workflow Status API. The workflow can have these status codes:
| Code | Message |
|---|---|
| SUBMITTED | Workflow trigger accepted by platform, but actual workflow run has not started. |
| RUNNING | Workflow in progress or under execution. |
| FAILED | Workflow execution completed with failures. |
| SUCCESS | Workflow execution completed successfully. |
After the execution begins, monitor the execution status of the workflow using the Workflow service. Use the same workflow_name as in step 5 and the runId from the step 5 response.
API reference: Workflow service
GET request: /workflow/<workflow_name>/workflowRun/<runId>
Sample Response Payload
{ "workflowId": "excel_ingestor_wf", "runId": "06f9c92f-e302-4987-9c69-a62abcc7e0cd", "startTimeStamp": 1614579997265, "endTimeStamp": 1614607532205, "status": "success", "submittedBy": "<user>@<domain>.com" }
Search the ingested records using the Search service POST /query_with_cursor endpoint.
POST request: /query_with_cursor
Payload:
Sample Request Payload
{ "kind": "<metadata target kind>", "query": "<search query>" }Sample Response Payload
{ "cursor": "<cursor_id>", "results": [ { "data": {}, "kind": "<kind>", "id": "<record_id>" } ], "totalCount": 1 }
You can get the details of the ingestion workflow by using the Status Processor service POST /v1/status/query endpoint. The correlationId is returned in the response header of the triggered workflow run. The correlationId parameter is mandatory; use it to find the details of the ingested records.
POST request: /v1/status/query
Payload:
Sample Request Payload
{ "statusQuery": { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "stage": [ "INGESTOR" ] }, "limit": 1000 }Sample Response Payload
{ "results": [ { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "393f02a4-5386-46ce-ab21-ce17415cfbc6", "stage": "INGESTOR", "status": "SUBMITTED", "message": "Workflow run submitted. ", "errorCode": 0, "userEmail": user@slb.com, "timestamp": 1640855809025 }, { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "393f02a4-5386-46ce-ab21-ce17415cfbc6", "stage": "INGESTOR", "status": "IN_PROGRESS", "message": "Workflow execution started", "errorCode": 0, "userEmail": user@slb.com, "timestamp": 1640855854 }, { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "393f02a4-5386-46ce-ab21-ce17415cfbc6", "stage": "INGESTOR", "status": "SUCCESS", "message": "Workflow execution successful", "errorCode": 0, "userEmail": user@slb.com, "timestamp": 1640856060 } ], "count": 3, "totalCount": 3, "limit": 1000 }Status of records:
Status of records for Payload
{ "statusQuery": { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "stage": [ "INGESTOR_SYNC" ] }, "limit": 1000 }Status of records for Response
{ "results": [ { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "opendes:wellbore:test-U1A0MzU4NDQ5MjEyODgx", "stage": "INGESTOR_SYNC", "status": "SUCCESS", "message": "Record ingestion successful.", "errorCode": 0, "userEmail": "9343be92-f16a-4137-92a6-e414b16caf11", "timestamp": 1640856023613 }, { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "opendes:wellbore:test-U1A1NjA3MDM2Mzk5MzUx", "stage": "INGESTOR_SYNC", "status": "SUCCESS", "message": "Record ingestion successful.", "errorCode": 0, "userEmail": "9343be92-f16a-4137-92a6-e414b16caf11", "timestamp": 1640856023817 }, { "correlationId": "8b1d3559-d82e-46c5-9385-f1fa9f3a49cf", "recordId": "opendes:field:test-MTI1OTI2QW1va3VyYQ", "stage": "INGESTOR_SYNC", "status": "SUCCESS", "message": "Record ingestion successful.", "errorCode": 0, "userEmail": "9343be92-f16a-4137-92a6-e414b16caf11", "timestamp": 1640856030218 } ], "count": 3, "totalCount": 3, "limit": 1000 }
| Stage | Message |
|---|---|
| INGESTOR | Job-level messages. |
| INGESTOR_SYNC | Record-level messages. |
| Status | Message |
|---|---|
| SUBMITTED | Workflow trigger accepted by platform, but actual workflow run has not started. |
| IN_PROGRESS | Workflow in progress or under execution. |
| FAILED | Workflow execution completed with failures. |
| SUCCESS | Workflow execution completed successfully. |
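A status response can be condensed into a per-status count plus the list of records that need attention. This is a small illustrative helper (the name `summarize_status` is hypothetical) that works on the `results` array of either stage, using only the `status` and `recordId` fields shown in the samples.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_status(results: Iterable[Dict]) -> Tuple[Dict[str, int], List[str]]:
    """Return (count per status, recordIds whose status is FAILED)."""
    results = list(results)
    counts = Counter(entry["status"] for entry in results)
    failed = [entry["recordId"] for entry in results if entry["status"] == "FAILED"]
    return dict(counts), failed
```

For the INGESTOR_SYNC stage, the failed list identifies the individual records to inspect; for the INGESTOR stage, it reflects the job as a whole.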
Validate ingested data using the Storage service.
- API Reference
- GET request: /storage/v2/records/<record_id>
- The record_id parameter is the ID of the ingested record received, for example, in the response of the above search query.
- The response shows the ingested data.
Excel file:
- The Excel DAG supports data only in tabular format (tables).
- The first row of each sheet of the Excel file is the header which represents the attributes of the ingested data. To maintain full data fidelity, the sheet headers must match 'ONE to ONE' with the raw schema that you specified.
- To support the ingestion of data from Excel files into nested schemas, you can specify a delimiter character (as described in the File Metadata Record section) to define a nested level in the sheet header. For example, if the nestedFieldDelimiter in the file metadata is set to ., a nested attribute in the sheet header should look similar to elevationReference.elevationFromMsl.unitKey.
- To support the ingestion of data from Excel files into schemas with nested arrays, you can specify a delimiter character (as described in the File Metadata Record section) to define nested array levels in the sheet header. For example, if the nestedFieldDelimiter in the file metadata is set to ., a nested attribute in the sheet header should look similar to Terms.[0].ObligationTypeID, Terms.[1].ObligationTypeID. Here, [] represents arrays, so avoid using [] in attribute names.
- From the second row onwards, each row represents a data record to ingest.
- If the target schema of each Excel sheet has attributes tagged as "x-osdu-natural-key", then they are used to generate the record id. For example, a combination of UWI+WELLNAME acting as a natural key will result in an ID similar to <authority>:<source>:<entitytype>-{Base64-encoded value of the UWI+WELLNAME defined in the Excel sheet}. If the target schema does not have any attributes tagged as "x-osdu-natural-key", or if the natural key attribute values are missing in the Excel sheet, the ID will be generated by the Storage service. The "x-osdu-natural-key" index must start at 1.
- To establish relationships, the key attributes of the parent records, described in its metadata file block, must be present in each equivalent sheet of the Excel file.
- This sample Excel file ingests two wellbores and one field.
- Sample Excel file: Sample.xlsx
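The header conventions above can be illustrated with a short Python sketch. Both functions are hypothetical illustrations, not the ingestor's actual code: `nest_row` shows how delimited headers and `[n]` segments could map onto a nested record, and `natural_key_suffix` assumes the ID suffix is the unpadded Base64 encoding of the natural-key values concatenated in index order. That assumption is consistent with the sample record ID opendes:field:test-MTI1OTI2QW1va3VyYQ, but the exact algorithm is not documented here.

```python
import base64
import re
from typing import Any, Dict, Sequence

_ARRAY_INDEX = re.compile(r"^\[(\d+)\]$")


def _key(part: str):
    """Turn '[0]' into the integer 0; leave ordinary attribute names as-is."""
    match = _ARRAY_INDEX.match(part)
    return int(match.group(1)) if match else part


def _listify(node: Any) -> Any:
    """Convert dicts keyed only by integers into lists, recursively."""
    if not isinstance(node, dict):
        return node
    if node and all(isinstance(k, int) for k in node):
        # Missing indices are not padded; items are simply ordered by index.
        return [_listify(node[k]) for k in sorted(node)]
    return {k: _listify(v) for k, v in node.items()}


def nest_row(row: Dict[str, Any], delimiter: str = ".") -> Dict[str, Any]:
    """Expand one sheet row (header -> cell value) into a nested record."""
    root: Dict[Any, Any] = {}
    for header, value in row.items():
        parts = header.split(delimiter)
        node = root
        for part in parts[:-1]:
            node = node.setdefault(_key(part), {})
        node[_key(parts[-1])] = value
    return _listify(root)


def natural_key_suffix(values: Sequence[str]) -> str:
    """Assumed ID suffix: unpadded Base64 of the concatenated key values."""
    joined = "".join(values)
    return base64.b64encode(joined.encode("utf-8")).decode("ascii").rstrip("=")
```

For example, `natural_key_suffix(["125926", "Amokura"])` reproduces the suffix of the field record ID in the status sample, where Id carries natural-key index 1 and Field Name carries index 2.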
Schema service
- Use the Schema service to map the attributes and their kinds for each sheet of the Excel file. The Schema kind should be unique; construct it using the attributes under schemaIdentity. The basic format is <authority>:<source>:<entityType>:<version>
- API Reference
- A sample schema for field and wellbore.
Sample file of schema for sheet field: Field_Schema.json
{ "license": "Copyright 2017-2020, Schlumberger\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n", "$schema": "http://json-schema.org/draft-07/schema#", "description": "The wellbore schema. Used to capture the general information about a wellbore. This information is sometimes called a \"wellbore header\". A wellbore represents the path from surface to a unique bottomhole location. The wellbore object is uniquely identified within the context of one well object.", "id": "https://slb-swt.visualstudio.com/data-management/Ingestion%20Services/_git/wke-schema?path=%2Fdomains%2Fwell%2Fjson_schema%2Fslb_wke_wellbore.json&version=GBmaster", "title": "Wellbore", "type": "object", "definitions": { "FieldData": { "description": "The domain specific data container for a wellbore.", "SpatialLocation": { "type": "object", "description": "The spatial location information such as coordinates, CRS information.", "$ref": "#/definitions/osdu:wks:AbstractSpatialLocation:1.0.0" }, "title": "Wellbore Data", "type": "object", "properties": { "Location Id": { "description": "Location Id", "title": "Location Id", "type": "number", "example": "27.85011793" }, "Field Name": { "description": "Field Name", "title": "Field Name", "type": "string", "x-osdu-natural-key": 2, "example": "String" }, "Id": { "description": "The unique field identifier", "title": "Unique field Identifier", "type": "string", "x-osdu-natural-key": 1, "example": [ "125926" ] }, "Source": { "description": "Source", "title": "Source", 
"type": "string", "example": "NZ" } }, "$id": "definitions/FieldeData" }, "accessControlList": { "description": "The access control tags associated with this entity.", "title": "Access Control List", "type": "object", "properties": { "viewers": { "description": "The data viewers group specification.", "title": "Owners", "type": "array", "items": { "type": "string" }, "example": [ "data.default.viewers@{domain}.com" ] }, "owners": { "description": "The data owner group specification.", "title": "Owners", "type": "array", "items": { "type": "string" }, "example": [ "data.default.owners@{domain}.com" ] } } }, "osdu:wks:AbstractFeatureCollection:1.0.0": { "x-osdu-inheriting-from-kind": [], "x-osdu-license": "Copyright 2021, The Open Group \\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 . Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.", "$schema": "http://json-schema.org/draft-07/schema#", "x-osdu-schema-source": "osdu:wks:AbstractFeatureCollection:1.0.0", "description": "GeoJSON feature collection as originally published in https://geojson.org/schema/FeatureCollection.json. 
Attention: the coordinate order is fixed: Longitude first, followed by Latitude, optionally height above MSL (EPSG:5714) as third coordinate.", "title": "GeoJSON FeatureCollection", "type": "object", "required": [ "type", "features" ], "properties": { "type": { "type": "string", "enum": [ "FeatureCollection" ] }, "features": { "type": "array", "items": { "title": "GeoJSON Feature", "type": "object", "required": [ "type", "properties", "geometry" ], "properties": { "geometry": { "oneOf": [ { "type": "null" }, { "title": "GeoJSON Point", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "type": "number" } }, "type": { "type": "string", "enum": [ "Point" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON LineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "LineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON Polygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "Polygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiPoint", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "MultiPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiLineString", "type": "object", 
"required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "MultiLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiPolygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } } }, "type": { "type": "string", "enum": [ "MultiPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON GeometryCollection", "type": "object", "required": [ "type", "geometries" ], "properties": { "type": { "type": "string", "enum": [ "GeometryCollection" ] }, "geometries": { "type": "array", "items": { "oneOf": [ { "title": "GeoJSON Point", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "type": "number" } }, "type": { "type": "string", "enum": [ "Point" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON LineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "LineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON Polygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ 
"Polygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiPoint", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "MultiPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiLineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "MultiLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "GeoJSON MultiPolygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } } }, "type": { "type": "string", "enum": [ "MultiPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } ] } }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } ] }, "type": { "type": "string", "enum": [ "Feature" ] }, "properties": { "oneOf": [ { "type": "null" }, { "type": "object" } ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } }, "$id": "https://schema.osdu.opengroup.org/json/abstract/AbstractFeatureCollection.1.0.0.json" }, "legal": { "description": "Legal meta data like legal tags, relevant other countries, legal status.", "title": "Legal Meta Data", "type": "object", "properties": { "legaltags": { "description": "The list of legal tags, see compliance API.", 
"title": "Legal Tags", "type": "array", "items": { "type": "string" } }, "otherRelevantDataCountries": { "description": "The list of other relevant data countries using the ISO 2-letter codes, see compliance API.", "title": "Other Relevant Data Countries", "type": "array", "items": { "type": "string" } }, "status": { "title": "Legal Status", "type": "string", "description": "The legal status." } } }, "osdu:wks:AbstractAnyCrsFeatureCollection:1.0.0": { "x-osdu-inheriting-from-kind": [], "x-osdu-license": "Copyright 2021, The Open Group \\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 . Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.", "$schema": "http://json-schema.org/draft-07/schema#", "x-osdu-schema-source": "osdu:wks:AbstractAnyCrsFeatureCollection:1.0.0", "description": "A schema like GeoJSON FeatureCollection with a non-WGS 84 CRS context; based on https://geojson.org/schema/FeatureCollection.json. 
Attention: the coordinate order is fixed: Longitude/Easting/Westing/X first, followed by Latitude/Northing/Southing/Y, optionally height as third coordinate.", "title": "AbstractAnyCrsFeatureCollection", "type": "object", "required": [ "type", "persistableReferenceCrs", "features" ], "properties": { "CoordinateReferenceSystemID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-CoordinateReferenceSystem:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "The CRS reference into the CoordinateReferenceSystem catalog.", "x-osdu-relationship": [ { "EntityType": "CoordinateReferenceSystem", "GroupType": "reference-data" } ], "title": "Coordinate Reference System ID", "type": "string", "example": "namespace:reference-data--CoordinateReferenceSystem:BoundCRS.SLB.32021.15851:" }, "persistableReferenceCrs": { "description": "The CRS reference as persistableReference string. If populated, the CoordinateReferenceSystemID takes precedence.", "type": "string", "title": "CRS Reference", "example": "{\"lateBoundCRS\":{\"wkt\":\"PROJCS[\\\"NAD_1927_StatePlane_North_Dakota_South_FIPS_3302\\\",GEOGCS[\\\"GCS_North_American_1927\\\",DATUM[\\\"D_North_American_1927\\\",SPHEROID[\\\"Clarke_1866\\\",6378206.4,294.9786982]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433]],PROJECTION[\\\"Lambert_Conformal_Conic\\\"],PARAMETER[\\\"False_Easting\\\",2000000.0],PARAMETER[\\\"False_Northing\\\",0.0],PARAMETER[\\\"Central_Meridian\\\",-100.5],PARAMETER[\\\"Standard_Parallel_1\\\",46.1833333333333],PARAMETER[\\\"Standard_Parallel_2\\\",47.4833333333333],PARAMETER[\\\"Latitude_Of_Origin\\\",45.6666666666667],UNIT[\\\"Foot_US\\\",0.304800609601219],AUTHORITY[\\\"EPSG\\\",32021]]\",\"ver\":\"PE_10_3_1\",\"name\":\"NAD_1927_StatePlane_North_Dakota_South_FIPS_3302\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"32021\"},\"type\":\"LBC\"},\"singleCT\":{\"wkt\":\"GEOGTRAN[\\\"NAD_1927_To_WGS_1984_79_CONUS\\\",GEOGCS[\\\"GCS_North_American_1927\\\",DATUM[\\\"D_North_American_1927\\\",SPHEROID
[\\\"Clarke_1866\\\",6378206.4,294.9786982]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433]],GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433]],METHOD[\\\"NADCON\\\"],PARAMETER[\\\"Dataset_conus\\\",0.0],AUTHORITY[\\\"EPSG\\\",15851]]\",\"ver\":\"PE_10_3_1\",\"name\":\"NAD_1927_To_WGS_1984_79_CONUS\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"15851\"},\"type\":\"ST\"},\"ver\":\"PE_10_3_1\",\"name\":\"NAD27 * OGP-Usa Conus / North Dakota South [32021,15851]\",\"authCode\":{\"auth\":\"SLB\",\"code\":\"32021079\"},\"type\":\"EBC\"}" }, "features": { "type": "array", "items": { "title": "AnyCrsGeoJSON Feature", "type": "object", "required": [ "type", "properties", "geometry" ], "properties": { "geometry": { "oneOf": [ { "type": "null" }, { "title": "AnyCrsGeoJSON Point", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "type": "number" } }, "type": { "type": "string", "enum": [ "AnyCrsPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON LineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "AnyCrsLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON Polygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "AnyCrsPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } 
}, { "title": "AnyCrsGeoJSON MultiPoint", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON MultiLineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON MultiPolygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON GeometryCollection", "type": "object", "required": [ "type", "geometries" ], "properties": { "type": { "type": "string", "enum": [ "AnyCrsGeometryCollection" ] }, "geometries": { "type": "array", "items": { "oneOf": [ { "title": "AnyCrsGeoJSON Point", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "type": "number" } }, "type": { "type": "string", "enum": [ "AnyCrsPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON LineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": 
"array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "AnyCrsLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON Polygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "AnyCrsPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON MultiPoint", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiPoint" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON MultiLineString", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "minItems": 2, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiLineString" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } }, { "title": "AnyCrsGeoJSON MultiPolygon", "type": "object", "required": [ "type", "coordinates" ], "properties": { "coordinates": { "type": "array", "items": { "type": "array", "items": { "minItems": 4, "type": "array", "items": { "minItems": 2, "type": "array", "items": { "type": "number" } } } } }, "type": { "type": "string", "enum": [ "AnyCrsMultiPolygon" ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } ] } }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } ] }, "type": { "type": "string", "enum": [ "AnyCrsFeature" ] }, "properties": { "oneOf": [ { 
"type": "null" }, { "type": "object" } ] }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } } } } }, "persistableReferenceUnitZ": { "description": "The unit of measure for the Z-axis (only for 3-dimensional coordinates, where the CRS does not describe the vertical unit). Note that the direction is upwards positive, i.e. Z means height.", "type": "string", "title": "Z-Unit Reference", "example": "{\"scaleOffset\":{\"scale\":1.0,\"offset\":0.0},\"symbol\":\"m\",\"baseMeasurement\":{\"ancestry\":\"Length\",\"type\":\"UM\"},\"type\":\"USO\"}" }, "bbox": { "minItems": 4, "type": "array", "items": { "type": "number" } }, "persistableReferenceVerticalCrs": { "description": "The VerticalCRS reference as persistableReference string. If populated, the VerticalCoordinateReferenceSystemID takes precedence. The property is null or empty for 2D geometries. For 3D geometries and absent or null persistableReferenceVerticalCrs the vertical CRS is either provided via persistableReferenceCrs's CompoundCRS or it is implicitly defined as EPSG:5714 MSL height.", "type": "string", "title": "Vertical CRS Reference", "example": "{\"authCode\":{\"auth\":\"EPSG\",\"code\":\"5773\"},\"type\":\"LBC\",\"ver\":\"PE_10_3_1\",\"name\":\"EGM96_Geoid\",\"wkt\":\"VERTCS[\\\"EGM96_Geoid\\\",VDATUM[\\\"EGM96_Geoid\\\"],PARAMETER[\\\"Vertical_Shift\\\",0.0],PARAMETER[\\\"Direction\\\",1.0],UNIT[\\\"Meter\\\",1.0],AUTHORITY[\\\"EPSG\\\",5773]]\"}" }, "type": { "type": "string", "enum": [ "AnyCrsFeatureCollection" ] }, "VerticalCoordinateReferenceSystemID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-CoordinateReferenceSystem:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "The explicit VerticalCRS reference into the CoordinateReferenceSystem catalog. This property stays empty for 2D geometries. 
Absent or empty values for 3D geometries mean the context may be provided by a CompoundCRS in 'CoordinateReferenceSystemID' or implicitly EPSG:5714 MSL height", "x-osdu-relationship": [ { "EntityType": "CoordinateReferenceSystem", "GroupType": "reference-data" } ], "title": "Vertical Coordinate Reference System ID", "type": "string", "example": "namespace:reference-data--CoordinateReferenceSystem:VerticalCRS.EPSG.5773:" } }, "$id": "https://schema.osdu.opengroup.org/json/abstract/AbstractAnyCrsFeatureCollection.1.0.0.json" }, "metaItem": { "description": "A meta data item, which allows the association of named properties or property values to a Unit/Measurement/CRS/Azimuth/Time context.", "title": "Frame of Reference Meta Data Item", "type": "object", "properties": { "name": { "description": "The name of the CRS or the symbol/name of the unit", "title": "Name or Symbol", "type": "string", "example": [ "ftUS", "NAD27 * OGP-Usa Conus / North Dakota South [32021,15851]" ] }, "propertyValues": { "description": "The list of property values, to which this meta data item provides Unit/CRS context to. Typically a unit symbol is a value to a data structure; this symbol is then registered in this propertyValues array and the persistableReference provides the absolute reference.", "title": "Attribute Names", "type": "array", "items": { "type": "string" }, "example": [ "Foot US", "ftUS" ] }, "persistableReference": { "description": "The persistable reference string uniquely identifying the CRS or Unit", "title": "Persistable Reference", "type": "string", "example": "{\"scaleOffset\":{\"scale\":0.3048006096012192,\"offset\":0.0},\"symbol\":\"ftUS\",\"baseMeasurement\":{\"ancestry\":\"Length\",\"type\":\"UM\"},\"type\":\"USO\"}" }, "uncertainty": { "title": "Uncertainty", "type": "number", "description": "The uncertainty of the values measured given the unit or CRS unit." 
}, "kind": { "description": "The kind of reference, unit, measurement, CRS or azimuth reference.", "title": "Reference Kind", "type": "string", "example": [ "Unit", "CRS", "Measurement", "AzimuthReference" ] }, "propertyNames": { "description": "The list of property names, to which this meta data item provides Unit/CRS context to. Data structures, which come in a single frame of reference, can register the property name, others require a full path like \"data.structureA.propertyB\" to define a unique context.", "title": "Attribute Names", "type": "array", "items": { "type": "string" }, "example": [ "elevationFromMsl", "totalDepthMdDriller", "wellHeadProjected" ] } }, "required": [ "kind", "persistableReference" ] }, "location": { "type": "object", "properties": { "epsg_code": { "description": "EPSG code of the CRS", "title": "EPSG Code", "type": "number", "example": "4326" }, "type": { "description": "Wellbore location type", "title": "Location Type", "type": "string", "example": "Surface Location" }, "crs": { "description": "Wellbore location CRS", "title": "CRS", "type": "string", "example": "World Geodetic System 1984" }, "latitude": { "description": "latitude", "type": "number", "title": "latitude", "example": [ 30 ] }, "longitude": { "description": "longitude", "type": "number", "title": "longitude", "example": [ 30 ] } } }, "parentProjectIds": { "type": "object", "description": "Project IDs", "properties": { "projectId": { "description": "The project ID reference.", "x-osdu-relationship": [ { "EntityType": "project", "GroupType": "master-data" } ], "type": "array", "items": { "type": "string" } } } }, "tags": { "description": "A generic dictionary of string keys mapping to string value. 
Only strings are permitted as keys and values.", "additionalProperties": { "type": "string" }, "title": "Tag Dictionary", "type": "object", "example": { "NameOfKey": "String value" } }, "osdu:wks:AbstractSpatialLocation:1.0.0": { "x-osdu-inheriting-from-kind": [], "x-osdu-license": "Copyright 2021, The Open Group \\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 . Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.", "$schema": "http://json-schema.org/draft-07/schema#", "x-osdu-schema-source": "osdu:wks:AbstractSpatialLocation:1.0.0", "description": "A geographic object which can be described by a set of points.", "title": "AbstractSpatialLocation", "type": "object", "properties": { "AsIngestedCoordinates": { "description": "The original or 'as ingested' coordinates (Point, MultiPoint, LineString, MultiLineString, Polygon or MultiPolygon). The name 'AsIngestedCoordinates' was chosen to contrast it to 'OriginalCoordinates', which carries the uncertainty whether any coordinate operations took place before ingestion. In cases where the original CRS is different from the as-ingested CRS, the OperationsApplied can also contain the list of operations applied to the coordinate prior to ingestion. The data structure is similar to GeoJSON FeatureCollection, however in a CRS context explicitly defined within the AbstractAnyCrsFeatureCollection. The coordinate sequence follows GeoJSON standard, i.e. 
'eastward/longitude', 'northward/latitude' {, 'upward/height' unless overridden by an explicit direction in the AsIngestedCoordinates.VerticalCoordinateReferenceSystemID}.", "x-osdu-frame-of-reference": "CRS:", "title": "As Ingested Coordinates", "$ref": "#/definitions/osdu:wks:AbstractAnyCrsFeatureCollection:1.0.0" }, "SpatialParameterTypeID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-SpatialParameterType:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "A type of spatial representation of an object, often general (e.g. an Outline, which could be applied to Field, Reservoir, Facility, etc.) or sometimes specific (e.g. Onshore Outline, State Offshore Outline, Federal Offshore Outline, 3 spatial representations that may be used by Countries).", "x-osdu-relationship": [ { "EntityType": "SpatialParameterType", "GroupType": "reference-data" } ], "type": "string" }, "QuantitativeAccuracyBandID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-QuantitativeAccuracyBand:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "An approximate quantitative assessment of the quality of a location (accurate to > 500 m (i.e. not very accurate)), to < 1 m, etc.", "x-osdu-relationship": [ { "EntityType": "QuantitativeAccuracyBand", "GroupType": "reference-data" } ], "type": "string" }, "CoordinateQualityCheckRemarks": { "type": "array", "description": "Freetext remarks on Quality Check.", "items": { "type": "string" } }, "AppliedOperations": { "description": "The audit trail of operations applied to the coordinates from the original state to the current state. The list may contain operations applied prior to ingestion as well as the operations applied to produce the Wgs84Coordinates. 
The text elements refer to ESRI style CRS and Transformation names, which may have to be translated to EPSG standard names.", "title": "Operations Applied", "type": "array", "items": { "type": "string" }, "example": [ "conversion from ED_1950_UTM_Zone_31N to GCS_European_1950; 1 points converted", "transformation GCS_European_1950 to GCS_WGS_1984 using ED_1950_To_WGS_1984_24; 1 points successfully transformed" ] }, "QualitativeSpatialAccuracyTypeID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-QualitativeSpatialAccuracyType:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "A qualitative description of the quality of a spatial location, e.g. unverifiable, not verified, basic validation.", "x-osdu-relationship": [ { "EntityType": "QualitativeSpatialAccuracyType", "GroupType": "reference-data" } ], "type": "string" }, "CoordinateQualityCheckPerformedBy": { "type": "string", "description": "The user who performed the Quality Check." }, "SpatialLocationCoordinatesDate": { "format": "date-time", "description": "Date when coordinates were measured or retrieved.", "x-osdu-frame-of-reference": "DateTime", "type": "string" }, "CoordinateQualityCheckDateTime": { "format": "date-time", "description": "The date of the Quality Check.", "x-osdu-frame-of-reference": "DateTime", "type": "string" }, "Wgs84Coordinates": { "title": "WGS 84 Coordinates", "description": "The normalized coordinates (Point, MultiPoint, LineString, MultiLineString, Polygon or MultiPolygon) based on WGS 84 (EPSG:4326 for 2-dimensional coordinates, EPSG:4326 + EPSG:5714 (MSL) for 3-dimensional coordinates). This derived coordinate representation is intended for global discoverability only. The schema of this substructure is identical to the GeoJSON FeatureCollection https://geojson.org/schema/FeatureCollection.json. The coordinate sequence follows GeoJSON standard, i.e. 
longitude, latitude {, height}", "$ref": "#/definitions/osdu:wks:AbstractFeatureCollection:1.0.0" }, "SpatialGeometryTypeID": { "pattern": "^[\\w\\-\\.]+:reference-data\\-\\-SpatialGeometryType:[\\w\\-\\.\\:\\%]+:[0-9]*$", "description": "Indicates the expected look of the SpatialParameterType, e.g. Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon. The value constrains the type of geometries in the GeoJSON Wgs84Coordinates and AsIngestedCoordinates.", "x-osdu-relationship": [ { "EntityType": "SpatialGeometryType", "GroupType": "reference-data" } ], "type": "string" } }, "$id": "https://schema.osdu.opengroup.org/json/abstract/AbstractSpatialLocation.1.0.0.json" } }, "properties": { "data": { "description": "Field data container", "title": "Field Data", "$ref": "#/definitions/FieldData" }, "kind": { "default": "opendes:test:Field:1.0.0", "description": "OSDU demo wellbore kind specification", "title": "Wellbore Kind", "type": "string" }, "meta": { "description": "The meta data section linking the 'unitKey', 'crsKey' to self-contained definitions (persistableReference)", "title": "Frame of Reference Meta Data", "type": "array", "items": { "$ref": "#/definitions/metaItem" } }, "legal": { "description": "The geological interpretation's legal tags", "title": "Legal Tags", "$ref": "#/definitions/legal" }, "acl": { "description": "The access control tags associated with this entity.", "title": "Access Control List", "$ref": "#/definitions/accessControlList" }, "id": { "description": "The unique identifier of the field", "title": "feild ID", "type": "string" }, "type": { "description": "The reference entity type as declared in common:metadata:entity:*.", "title": "Entity Type", "type": "string" }, "version": { "format": "int64", "description": "The version number of this wellbore; set by the framework.", "title": "Entity Version Number", "type": "number", "example": "1040815391631285" }, "tags": { "description": "The links to data, which constitute the 
inputs.", "title": "tags", "type": "array", "items": { "$ref": "#/definitions/tags" } } } }
Sample file of schema for sheet wellbore: Wellbore_Schema.json
{ "license": "Copyright 2017-2020, Schlumberger\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n", "$schema": "http://json-schema.org/draft-07/schema#", "description": "The wellbore schema. Used to capture the general information about a wellbore. This information is sometimes called a \"wellbore header\". A wellbore represents the path from surface to a unique bottomhole location. The wellbore object is uniquely identified within the context of one well object.", "id": "https://slb-swt.visualstudio.com/data-management/Ingestion%20Services/_git/wke-schema?path=%2Fdomains%2Fwell%2Fjson_schema%2Fslb_wke_wellbore.json&version=GBmaster", "title": "Wellbore", "type": "object", "definitions": { "legal": { "$schema": "http://json-schema.org/draft-07/schema#", "title": "legal", "type": "object" }, "metaItem": { "$schema": "http://json-schema.org/draft-07/schema#", "title": "metaItem", "type": "object" }, "tagDictionary": { "$schema": "http://json-schema.org/draft-07/schema#", "title": "tagDictionary", "type": "object" }, "linkList": { "type": "object", "properties": { "name": { "link": "string" } } }, "wellboreData": { "description": "The domain specific data container for a wellbore.", "title": "Wellbore Data", "type": "object", "properties": { "SPUD_DATE": { "format": "date", "description": "The date and time when activities to drill the borehole begin to create a hole in the earth. For a sidetrack, this is the date kickoff operations began. 
The format follows ISO 8601 YYYY-MM-DD extended format", "x-slb-aliasProperties": [ "witsml:DTimKickoff", "ocean:SPUD_DATE", "drillplan:spud_date" ], "title": "Spud Date", "type": "string", "example": "2013-03-22" }, "TVD": { "x-slb-measurement": "True Vertical Depth", "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "type": "string", "title": "True Vertical Depth", "example": [ 20711, "TBD" ] }, "PERMIT_NUMBER": { "description": "Ther permit number for the wellbore", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Permit Number", "type": "string", "example": "SMP-09995" }, "WELLBORE_NAME": { "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Wellbore Name", "type": "string", "example": "SMP G09995 001S0B1" }, "CRS": { "description": "Wellbore location CRS", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "CRS", "type": "string", "example": "World Geodetic System 1984" }, "LONGUITUDE": { "x-slb-measurement": "Longuitude", "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "type": "number", "title": "Longuitude", "example": [ -119.2, "TBD" ] }, "STATE": { "description": "The state, in which the wellbore is located.", "x-slb-aliasProperties": [ "witsml:State" ], "title": "State", "type": "string", "example": [ "Texas" ] }, "CLASS": { "description": "The current class of the wellbore", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "class", "type": "string", "example": "NEW FIELD WILDCAT" }, "WELLBORE_SHAPE": { "description": "The shape of the wellbore", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Wellbore Shape", "type": "string", "example": [ "DIRECTIONAL", "VERTICAL" ] }, "FORMATION_AT_TD": { "description": "The formation name at the wellbore total depth", "x-slb-aliasProperties": [ "witsml:FORMATION_AT_TD" ], "title": "Formation at TD", "type": "string", "example": "MIOCENE LOWER" }, "PERMIT_DATE": { "format": "date", "description": "The date and time when the wellbore permit was issued. 
The format follows ISO 8601 YYYY-MM-DD extended format", "x-slb-aliasProperties": [ "witsml:DTimKickoff", "ocean:PERMIT_DATE", "drillplan:PERMIT_DATE" ], "title": "Permit Date", "type": "string", "example": "2013-03-22" }, "STATUS": { "description": "The current status of the wellbore", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Status", "type": "string", "example": "DRY & ABANDONED" }, "COUNTRY": { "description": "The country, in which the wellbore is located. The country name follows the convention in ISO 3166-1 'English short country name', see https://en.wikipedia.org/wiki/ISO_3166-1", "x-slb-aliasProperties": [ "witsml:Country" ], "title": "Country", "type": "string", "example": [ "United States of America" ] }, "WB_NUMBER": { "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Wellbore Number", "type": "string", "example": "001S0B1" }, "MD": { "x-slb-measurement": "Measured Depth", "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Measured Depth", "type": "string", "example": "12.20" }, "ORIGINAL_OPERATOR": { "description": "The original operator of the wellbore.", "x-slb-aliasProperties": [ "ocean:Operator", "witsml:Operator" ], "title": "Original Operator", "type": "string", "example": "Anadarko Petroleum" }, "BASIN": { "description": "The basin name, to which the wellbore belongs.", "x-slb-aliasProperties": [ "witsml:BASIN" ], "title": "Basin", "type": "string", "example": "ATWATER" }, "EPSG_CODE": { "description": "EPSG code of the CRS", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "EPSG Code", "type": "string", "example": "4326" }, "COUNTY": { "description": "The county, in which the wellbore is located.", "x-slb-aliasProperties": [ "witsml:County" ], "title": "County", "type": "string", "example": [ "ATWATER VALLEY" ] }, "UNIT_SYSTEM": { "description": "Unit system used for the wellbore measurements", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Unit Sustem", "type": "string", "example": 
"English" }, "UWI": { "description": "The unique wellbore identifier, aka. API number, US well number or UBHI. Codes can have 10, 12 or 14 digits depending on the availability of directional sidetrack (2 digits) and event sequence codes (2 digits).", "x-slb-aliasProperties": [ "ocean:UWI", "witsml:SuffixAPI", "drillplan:uwi" ], "title": "Unique Wellbore Identifier", "type": "string", "x-osdu-natural-key": 1, "example": [ "SP435844921288", "42-501-20130-01-02" ] }, "FIELD": { "description": "The field name, to which the wellbore belongs.", "x-slb-aliasProperties": [ "witsml:Field" ], "title": "Field", "type": "string", "example": "ATWATER VLLY B 8" }, "INITIAL_COMPLETION_DATE": { "format": "date", "description": "The date and time of the initial completion of the wellbore. The format follows ISO 8601 YYYY-MM-DD extended format", "x-slb-aliasProperties": [ "witsml:DTimKickoff", "ocean:INITIAL_COMPLETION_DATE", "drillplan:INITIAL_COMPLETION_DATE" ], "title": "Initial Completion Date", "type": "string", "example": "2013-03-22" }, "ELEVATION": { "x-slb-measurement": "Elevation", "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Elevation", "type": "string", "example": [ 84, "TBD" ] }, "STATUS_DATE": { "format": "date", "description": "The date and time of the current status of the wellbore. 
The format follows ISO 8601 YYYY-MM-DD extended format", "x-slb-aliasProperties": [ "witsml:DTimKickoff", "ocean:STATUS_DATE", "drillplan:STATUS_DATE" ], "title": "Status Date", "type": "string", "example": "2013-03-22" }, "OPERATOR": { "description": "The operator of the wellbore.", "x-slb-aliasProperties": [ "ocean:Operator", "witsml:Operator" ], "title": "Operator", "type": "string", "example": "Anadarko Petroleum" }, "LEASE": { "description": "The lease name, to which the wellbore belongs.", "x-slb-aliasProperties": [ "witsml:LEASE" ], "title": "LEASE", "type": "string", "example": "SMP G09995" }, "LATITUDE": { "x-slb-measurement": "Latitude", "description": "TBD", "x-slb-aliasProperties": [ "TBD:TBD" ], "type": "number", "title": "Latitude", "example": [ 60.2, "TBD" ] }, "ELEVATION_REF": { "description": "Elevation reference used for the measurements", "x-slb-aliasProperties": [ "TBD:TBD" ], "title": "Elevation reference", "type": "string", "example": "MSL" } }, "$id": "definitions/wellboreData" } }, "properties": { "ancestry": { "description": "The links to data, which constitute the inputs.", "title": "Ancestry", "$ref": "#/definitions/linkList" }, "data": { "description": "Wellbore data container", "title": "Wellbore Data", "$ref": "#/definitions/wellboreData" }, "kind": { "default": "slb:osdudemo:wellbore:1.0.0", "description": "OSDU demo wellbore kind specification", "title": "Wellbore Kind", "type": "string" }, "meta": { "description": "The meta data section linking the 'unitKey', 'crsKey' to self-contained definitions (persistableReference)", "title": "Frame of Reference Meta Data", "type": "array", "items": { "$ref": "#/definitions/metaItem" } }, "legal": { "description": "The geological interpretation's legal tags", "title": "Legal Tags", "$ref": "#/definitions/legal" }, "acl": { "description": "The access control tags associated with this entity.", "title": "Access Control List", "$ref": "#/definitions/tagDictionary" }, "id": { "description": "The 
unique identifier of the wellbore", "title": "Wellbore ID", "type": "string" }, "type": { "description": "The reference entity type as declared in common:metadata:entity:*.", "title": "Entity Type", "type": "string" }, "version": { "format": "int64", "description": "The version number of this wellbore; set by the framework.", "title": "Entity Version Number", "type": "number", "example": "1040815391631285" } } }
- Use the Schema service to map the attributes and their kinds for each sheet of the Excel file. The Schema kind should be unique; construct it using the attributes under schemaIdentity. The basic format is
The file metadata provides the ACL, legal tags, file source, and target kind of the ingested data; the attributes and their kinds; frame-of-reference information such as unit and CRS; and the relationships to existing parent records.
Sample metadata:
Sample_Excel_File_metadata.json
{
"data": {
"Endian": "BIG",
"Description": "string",
"DatasetProperties": {
"FileSourceInfo": {
"FileSource": "/osdu-user/1640852899353-2021-12-30-08-28-19-353/723fb105ee84494999e1a3dfaa4004e8",
"PreLoadFilePath": "string",
"PreloadFileCreateUser": "string",
"PreloadFileModifyUser": "string",
"PreloadFileModifyDate": "string",
"Name": "NZ-Amokura_Sample.xlsx",
"FileSize": "string",
"EncodingFormatTypeID": "string"
}
},
"ResourceLifecycleStatus": "string",
"TotalSize": "string",
"ResourceCurationStatus": "string",
"EncodingFormatTypeID": "string",
"Source": "string",
"Name": "NZ-Amokura.xlsx",
"ResourceHomeRegionID": "string",
"ResourceHostRegionIDs": [
"string"
],
"ExtensionProperties": {
"Classification": "string",
"Description": "string",
"ExternalIds": [
"string"
],
"FileDateCreated": {},
"FileDateModified": {},
"FileContentsDetails": {
"excelSheetsMetadata": [
{
"excelSheetName": "field",
"TargetKind": "<authority>:test:field:1.0.0",
"nestedFieldDelimiter": ".",
"FrameOfReference": [
{
"kind": "CRS",
"name": "GCS_WGS_1984",
"persistableReference": "{\"wkt\":\"GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433],AUTHORITY[\\\"EPSG\\\",4326]]\",\"ver\":\"PE_10_3_1\",\"name\":\"GCS_WGS_1984\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"4326\"},\"type\":\"LBC\"}",
"propertyNames": [
"Location Id"
],
"propertyValues": [
"deg"
],
"uncertainty": 0
},
{
"kind": "DateTime",
"persistableReference": "{\"type\": \"DAT\", \"format\": \"MM-dd-yyyy\"}",
"name": "date",
"propertyNames": [
"Discovery Date"
],
"propertyValues": [
"Discovery Date"
],
"uncertainty": 0
}
],
"relationships": [
{
"project": {
"ids": [
"<authority>:testSource:project-sxs0f1a5219-4640-50af-9f63:",
"<authority>:testSource:project-dsv05f45665-5885-40ad-d9m3:"
]
}
}
],
"acl": {
"viewers": [
"data.default.viewers@{domain}.com"
],
"owners": [
"data.default.viewers@{domain}.com"
]
},
"legal": {
"legaltags": [
"<valid legal tag>"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
}
},
{
"excelSheetName": "wellbore",
"TargetKind": "<authority>:test:wellbore:1.0.0",
"nestedFieldDelimiter": ".",
"SpatialMapping": {
"type": "point",
"latitude": "LATITUDE",
"longitude": "LONGUITUDE"
},
"FrameOfReference": [
{
"kind": "CRS",
"name": "GCS_WGS_1984",
"persistableReference": "{\"wkt\":\"GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433],AUTHORITY[\\\"EPSG\\\",4326]]\",\"ver\":\"PE_10_3_1\",\"name\":\"GCS_WGS_1984\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"4326\"},\"type\":\"LBC\"}",
"propertyNames": [
"LATITUDE",
"LONGUITUDE"
],
"propertyValues": [
"deg"
],
"uncertainty": 0
},
{
"kind": "DateTime",
"persistableReference": "{\"type\": \"DAT\", \"format\": \"MM-dd-yyyy\"}",
"name": "date",
"propertyNames": [
"STATUS_DATE",
"SPUD_DATE"
],
"propertyValues": [
"STATUS_DATE",
"SPUD_DATE"
],
"uncertainty": 0
},
{
"kind": "Unit",
"name": "ft",
"persistableReference": "{\"scaleOffset\":{\"scale\":0.3048,\"offset\":0.0},\"symbol\":\"ft\",\"baseMeasurement\":{\"ancestry\":\"Length\",\"type\":\"UM\"},\"type\":\"USO\"}",
"propertyNames": [
"md",
"tvd",
"elevation"
],
"propertyValues": [
"ft"
],
"uncertainty": 0
}
],
"relationships": [
{
"project": {
"ids": [
"<authority>:testSource:project-sxs0f1a5219-4640-50af-9f63:",
"<authority>:testSource:project-dsv05f45665-5885-40ad-d9m3:"
]
}
}
],
"relatedNaturalKey": [
{
"well": {
"targetKind": "<authority>:testSource:well:1.0.0",
"keys": [
{
"sourceColumn": "UBHI",
"targetAttribute": "uwi"
}
]
}
}
]
}
]
}
},
"ResourceSecurityClassification": "string",
"ExistenceKind": "string",
"SchemaFormatTypeID": "string"
},
"meta": [
{}
],
"id": "<authority>:dataset--File.Generic:acf7cf1f-e396-4075-a381-bca3117c5eab",
"version": 1640855792471113,
"kind": "<authority>:wks:dataset--File.Generic:1.0.0",
"acl": {
"viewers": [
"data.default.viewers@{domain}.com"
],
"owners": [
"data.default.owners@{domain}.com"
]
},
"legal": {
"legaltags": [
"<valid legal tag>"
],
"otherRelevantDataCountries": [
"US"
],
"status": "compliant"
}
}
ACL:
Specifies the Access Control list for the Excel file.
Example:
"acl": { "viewers": [ "data.default.viewers@{domain}.com" ], "owners": [ "data.default.viewers@{domain}.com" ] }
Legal Tags:
- Specifies the legal tags for the Excel file. These tags hold the data compliance details for the ingested data.
- API Reference
- Example:
"legal": { "legaltags": [ "opendes-public-usa-dataset-7643990" ], "otherRelevantDataCountries": [ "US" ], "status": "compliant" }
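The ACL and legal blocks shown above can be checked on the client side before the metadata record is submitted. The following is a minimal sketch, not part of the platform: the function name `check_acl_and_legal` is hypothetical, and the set of keys treated as mandatory is an assumption drawn only from the examples in this document. The platform applies its own validation.

```python
def check_acl_and_legal(block):
    """Return a list of problems found in a dict carrying 'acl' and 'legal'.

    Assumption: viewers, owners, legaltags and otherRelevantDataCountries
    must each be a non-empty list, as in the examples in this document.
    """
    problems = []
    acl = block.get("acl", {})
    for role in ("viewers", "owners"):
        if not acl.get(role):
            problems.append(f"acl.{role} is missing or empty")
    legal = block.get("legal", {})
    for key in ("legaltags", "otherRelevantDataCountries"):
        if not legal.get(key):
            problems.append(f"legal.{key} is missing or empty")
    return problems


# A sheet-level block with an empty owners list (illustrative values only).
sheet = {
    "acl": {"viewers": ["data.default.viewers@{domain}.com"], "owners": []},
    "legal": {
        "legaltags": ["<valid legal tag>"],
        "otherRelevantDataCountries": ["US"],
        "status": "compliant",
    },
}
print(check_acl_and_legal(sheet))
```

Running such a check before upload only catches structural omissions; whether a legal tag or group actually exists is still decided by the platform services.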
File source:
- This is the relative file path of the Excel file uploaded in the file-staging-area.
- This value is the relative file path that you received from the Get Signed Url response (mentioned above in Step #2).
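As an illustration of the step above, the sketch below copies the relative file path from a signed-URL response into the metadata record at `data.DatasetProperties.FileSourceInfo.FileSource`, the location used in the sample metadata. The response shape (`Location.FileSource`) and the helper name `with_file_source` are assumptions made for this example; check the actual response returned in your environment.

```python
import copy


def with_file_source(metadata, signed_url_response):
    """Return a copy of the metadata record with FileSource filled in.

    Assumption: the signed-URL response looks like
    {"Location": {"FileSource": "<relative path>"}}.
    """
    file_source = signed_url_response["Location"]["FileSource"]
    record = copy.deepcopy(metadata)  # leave the caller's dict untouched
    info = (
        record.setdefault("data", {})
        .setdefault("DatasetProperties", {})
        .setdefault("FileSourceInfo", {})
    )
    info["FileSource"] = file_source
    return record


# Hypothetical response carrying the path used in the sample metadata.
response = {
    "Location": {
        "FileSource": "/osdu-user/1640852899353-2021-12-30-08-28-19-353/723fb105ee84494999e1a3dfaa4004e8"
    }
}
template = {"data": {"DatasetProperties": {"FileSourceInfo": {"Name": "NZ-Amokura_Sample.xlsx"}}}}
record = with_file_source(template, response)
print(record["data"]["DatasetProperties"]["FileSourceInfo"]["FileSource"])
```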
Metadata for each sheet of Excel
This information is provided under FileContentsDetails.excelSheetsMetadata and is used to support the ingestion of the data present in each Excel sheet.
Target kind/Schema kind:
- This is the schema kind into which data is ingested for each Excel sheet, in the following format:
<<authority>:<source>:<entityType>:<version>>
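The kind format above can be checked before a metadata record is submitted. The following is a minimal sketch, not part of the ingestor: the `Kind` type, the `parse_kind` helper, and the sample value `opendes:testSource:wellbore:1.0.0` are hypothetical and used only for illustration.

```python
from typing import NamedTuple


class Kind(NamedTuple):
    """The four colon-separated parts of a schema kind."""
    authority: str
    source: str
    entity_type: str
    version: str


def parse_kind(kind: str) -> Kind:
    """Split a kind of the form authority:source:entityType:version.

    Hypothetical helper: it only checks that there are four non-empty parts.
    """
    parts = kind.split(":")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four non-empty ':'-separated parts, got {kind!r}")
    return Kind(*parts)


# Illustrative value only; substitute the kind registered in your environment.
target = parse_kind("opendes:testSource:wellbore:1.0.0")
print(target.entity_type, target.version)
```

A malformed value such as `testSource:wellbore:1.0.0` raises `ValueError` in this sketch, which makes a missing authority visible before the workflow is triggered.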
Nested/Nested Array schema
To support the ingestion of data into nested/nested array attributes, the column headers of the uploaded Excel file must match the nested/nested array attributes of the target schema, using the delimiter character defined in the metadata file.
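To make the header convention concrete, the sketch below expands one flat row into a nested structure. It is an illustration under assumptions, not the ingestor's actual parsing logic: the `expand_row` helper and the sample row are hypothetical, and the `[0]` index notation follows the `location.[0].latitude` style of header used in this document's spatial example.

```python
import re
from typing import Any

_INDEX = re.compile(r"^\[(\d+)\]$")  # matches an array-index segment such as "[0]"


def _parse_part(part: str) -> Any:
    """Turn "[3]" into the integer 3; leave attribute names unchanged."""
    match = _INDEX.match(part)
    return int(match.group(1)) if match else part


def _child(container: Any, key: Any, default: Any) -> Any:
    """Return container[key], creating it with `default` when it is absent."""
    if isinstance(container, list):
        while len(container) <= key:
            container.append(None)
        if container[key] is None:
            container[key] = default
        return container[key]
    return container.setdefault(key, default)


def expand_row(row: dict[str, Any], delimiter: str = ".") -> dict[str, Any]:
    """Expand delimiter-separated headers into nested dicts and lists."""
    result: dict[str, Any] = {}
    for header, value in row.items():
        parts = [_parse_part(p) for p in header.split(delimiter)]
        node: Any = result
        # Walk every level except the last, creating containers as needed.
        for key, following in zip(parts, parts[1:]):
            node = _child(node, key, [] if isinstance(following, int) else {})
        leaf = parts[-1]
        if isinstance(node, list):
            while len(node) <= leaf:
                node.append(None)
        node[leaf] = value
    return result


row = {"name": "W-1", "depth.md": 1200, "location.[0].latitude": 29.7}
print(expand_row(row))
```

If the header used a different character than the configured delimiter (for example `depth/md` with delimiter `.`), the sketch would keep `depth/md` as a single flat key, which mirrors the rule that mismatched delimiters are not treated as nested.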
The nestedFieldDelimiter attribute in the file metadata defines which character is used in the Excel file header to separate the different levels of nested/nested array attributes while the ingestor parses the files.
The delimiter character used to define nested/nested array structures in the Excel file header must match the one defined by nestedFieldDelimiter in the file metadata record; otherwise, the attributes in the Excel file are not treated as nested/nested array.
{ "ExtensionProperties": { "FileContentsDetails": { "TargetKind": "<<authority>:<source>:<entityType>:<version>>", "nestedFieldDelimiter": ".", "FileType": "Excel" } } }
Unit
- This value is used for converting the declared frame of reference information into the appropriate persistable reference, as per the OSDU Data Platform Unit service. This information is stored in the meta[] block.
- The ExtensionProperties Block is used to provide content details of the file; Excel ingestion workflow uses this same block for the Unit information.
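As an illustration of what a Unit persistable reference carries, the sketch below reads the `ft` reference from the sample record earlier in this document and applies its scale and offset. This is not the Unit service implementation. The `to_base_unit` helper is hypothetical, and the formula `scale * (value - offset)` is an assumed convention; for the `ft` entry the offset is 0.0, so the result does not depend on that choice.

```python
import json

# persistableReference for "ft", copied from the sample record in this document.
FT_REFERENCE = (
    '{"scaleOffset":{"scale":0.3048,"offset":0.0},"symbol":"ft",'
    '"baseMeasurement":{"ancestry":"Length","type":"UM"},"type":"USO"}'
)


def to_base_unit(value: float, persistable_reference: str) -> float:
    """Convert a value to the base unit of its measurement (hypothetical helper)."""
    reference = json.loads(persistable_reference)
    scale_offset = reference["scaleOffset"]
    # Assumed convention: base = scale * (value - offset).
    return scale_offset["scale"] * (value - scale_offset["offset"])


# 1000 ft expressed in the base unit of Length (meters).
print(to_base_unit(1000.0, FT_REFERENCE))
```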
CRS
- The CRS frame of reference information should be included in the schema of the data, including the source CRS (either geographic or projected). This CRS information and the persistable reference (if provided in the schema) are stored in the meta[] block.
- The ExtensionProperties block is used to provide the content details of the file. The Excel ingestion workflow uses this same block for the CRS information.
Relationships
Excel ingestion supports two kinds of relationships:
Deterministic (Schema-driven)
These relationships require that the entity be referenced in the record's targetKind schema under an attribute that has the x-osdu-relationship tag. Because they are present in the schema, they are represented directly as attributes in the data block of the record.
- String and array type relationships are supported.
- Pattern matching is also performed if a pattern is present for the matching entity in the schema.
Non-deterministic (Data-driven)
These relationships do not require any mention in the schema. They are represented within the data.relationships block of the record.
The ExtensionProperties block in the file metadata record is used to provide additional information for ingestion. Use this block to provide relationship information. There are three ways to provide this information:
In the relationships block, with the entity name and a list of parent record IDs. The IDs provided here are used directly to establish relationships.
In the relatedNaturalKey block, as an entity that requires a search of the targetKind, using the natural keys provided, to establish a relationship.
- sourceColumn: Column name in the Excel file that refers to the key attribute of the parent.
- targetKind: Schema ID of the parent record.
- targetAttribute: The key attribute of the parent record, which is used to search for the parent record. If the targetAttribute is a nested attribute, its levels must be separated by a period (.).
Prerequisites: The Excel file must have the key attributes of the parent records.
In the relatedNaturalKey block, as an entity that has the related parent record ID directly in the Excel file under the sourceColumn of the keys block.
- sourceColumn: Column name in the Excel file that refers to the parent record ID.
Prerequisites: The parent record ID must be present under the sourceColumn of the Excel file.
{ "ExtensionProperties": { "FileContentsDetails": { "TargetKind": "<authority>:<source>:wellbore:1.0.0", "nestedFieldDelimiter": ".", "FileType": "Excel", "relationships": [ { "project": { "ids": [ "<authority>:<source>:project-sxs0f1a5219-4640-50af-9f63:", "<authority>:<source>:project-dsv05f45665-5885-40ad-d9m3:" ] }, "states": { "ids": [ "<authority>:<source>:project-sxs0f1a5219-4640-50af-3r34:", "<authority>:<source>:project-dsv05f45665-5885-40ad-3m63:" ] } } ], "relatedNaturalKey": [ { "well": { "targetKind": "<authority>:<source>:well:1.0.0", "keys": [ { "sourceColumn": "relationshipsToSet.parent_uwi", "targetAttribute": "UWI" } ] } }, { "states": { "targetKind": "<authority>:<source>:states:1.0.0", "keys": [ { "sourceColumn": "state", "targetAttribute": "STATE_NAME" } ] } }, { "trajectory": { "keys": [ { "sourceColumn": "parentDefTrajectory" } ] } }, { "field": { "keys": [ { "sourceColumn": "parentField" } ] } } ] } } }
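The two uses of the relatedNaturalKey block can be told apart by which fields an entry provides. The sketch below labels each entity accordingly. It is a hypothetical pre-flight check, not the ingestor's logic: the `classify_related_keys` helper and the `"search"` and `"direct"` labels are assumptions made for illustration.

```python
from typing import Any


def classify_related_keys(related_natural_key: list[dict[str, Any]]) -> dict[str, str]:
    """Label each entity as "search" (natural-key lookup) or "direct" (ID in the sheet)."""
    modes: dict[str, str] = {}
    for entry in related_natural_key:
        for entity, spec in entry.items():
            keys = spec.get("keys", [])
            # A search needs both a targetKind and a targetAttribute to query on.
            needs_search = "targetKind" in spec and any(
                "targetAttribute" in key for key in keys
            )
            modes[entity] = "search" if needs_search else "direct"
    return modes


# Two entries taken from the example above.
related = [
    {"well": {"targetKind": "<authority>:<source>:well:1.0.0",
              "keys": [{"sourceColumn": "relationshipsToSet.parent_uwi",
                        "targetAttribute": "UWI"}]}},
    {"trajectory": {"keys": [{"sourceColumn": "parentDefTrajectory"}]}},
]
print(classify_related_keys(related))
```

In this sketch, `well` is labeled `"search"` because it supplies a targetKind and a targetAttribute, while `trajectory` is labeled `"direct"` because only a sourceColumn is given.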
The schema of the record should have information about attributes that contain deterministic relationships.
The EntityType field within the x-osdu-relationship block should contain the entity that needs to be matched from the ExtensionProperties block.
{ "properties": { "wellId": { "x-osdu-relationship": [ { "EntityType": "well", "GroupType": "master-data" } ], "type": "string" }, "defTrajectoryId": { "x-osdu-relationship": [ { "EntityType": "trajectory" } ], "pattern": "^[\\w\\-\\.]+:[\\w\\-\\.]+:[\\w\\-\\.]+:[\\w\\-\\.]*$", "type": "string" }, "fieldId": { "x-osdu-relationship": [ { "EntityType": "field" } ], "pattern": "^[\\w\\-\\.]+:master-data\\-\\-field:[\\w\\-\\.\\:\\%]+:[0-9]*$", "type": "string" }, "projectIds": { "description": "The relationships tags associated with this entity.", "title": "Project IDs", "type": "object", "x-osdu-indexing": { "type": "nested" }, "$ref": "#/definitions/parentProjectIds" }, "parentProjectIds": { "type": "object", "description": "Project IDs", "properties": { "projectId": { "description": "The project ID reference.", "x-osdu-relationship": [ { "EntityType": "project", "GroupType": "master-data" } ], "type": "array", "items": { "type": "string" } } } } } }
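The two pattern attributes in the schema above can be tried out against candidate record IDs with Python's `re` module. This is only a sketch of what such a check might look like; how the ingestor applies the patterns internally is not described here, and the `matches_schema_pattern` helper and the sample IDs are hypothetical.

```python
import re

# Patterns copied from the schema above, with the JSON string escaping removed.
TRAJECTORY_PATTERN = r"^[\w\-\.]+:[\w\-\.]+:[\w\-\.]+:[\w\-\.]*$"
FIELD_PATTERN = r"^[\w\-\.]+:master-data\-\-field:[\w\-\.\:\%]+:[0-9]*$"


def matches_schema_pattern(record_id: str, pattern: str) -> bool:
    """Return True when the record ID satisfies the schema pattern."""
    return re.match(pattern, record_id) is not None


# Illustrative IDs only; real IDs come from your data partition.
print(matches_schema_pattern("opendes:testSource:trajectory-sxs0f1a5219:", TRAJECTORY_PATTERN))
print(matches_schema_pattern("opendes:master-data--field:abc123:", FIELD_PATTERN))
print(matches_schema_pattern("opendes:testSource:field-abc123:", FIELD_PATTERN))
```

The third ID fails the field pattern because its second segment is not the literal `master-data--field`, which shows how a schema pattern can narrow the set of parent IDs that are acceptable.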
The final record will then have the relationships defined as below:
{ "data": { "relationships": { "field": { "ids": [ "<authority>:<source>:field-sxs0f1a5219-4640-50af-9f63-c09140d57c12:" ] }, "states": { "ids": [ "slb-osdu-prod-des-prod-testing:states:external-dGV4YXM" ] } }, "projectIds": { "parentProjectIds": { "projectId": [ "<authority>:<source>:project-sxs0f1a5219-4640-50af-9f63:", "<authority>:<source>:project-dsv05f45665-5885-40ad-d9m3:" ] } }, "defTrajectoryId": "<authority>:<source>:trajectory-sxs0f1a5219-4640-50af-9f63-c09140d57c12:", "wellId": "<authority>:well:external-REQxODI4MTAxMw:" } }
Spatial data
- Prerequisites:
The target schema must have the OSDU spatial location block.
The Excel file must contain the spatial data attributes.
The ExtensionProperties Block is used to provide content details of the file. The Workflow service uses this same block to provide spatial data information.
SpatialMapping: This section is used to create the spatial data block in the ingested records.
type: This field refers to the type of the spatial data; currently, the Workflow service supports only the point type.
latitude: This field refers to the latitude of a point, or to its Y location in the case of a projected CRS.
longitude: This field refers to the longitude of a point, or to its X location in the case of a projected CRS.
{ "ExtensionProperties": { "FileContentsDetails": { "TargetKind": "<authority>:<source>:wellbore:1.0.0", "nestedFieldDelimiter": ".", "FileType": "Excel", "SpatialMapping": { "type": "point", "latitude": "location.[0].latitude", "longitude": "location.[0].longitude" }, "FrameOfReference": [ { "kind": "CRS", "name": "GCS_WGS_1984", "persistableReference": "{\"wkt\":\"GEOGCS[\\\"GCS_WGS_1984\\\",DATUM[\\\"D_WGS_1984\\\",SPHEROID[\\\"WGS_1984\\\",6378137.0,298.257223563]],PRIMEM[\\\"Greenwich\\\",0.0],UNIT[\\\"Degree\\\",0.0174532925199433],AUTHORITY[\\\"EPSG\\\",4326]]\",\"ver\":\"PE_10_3_1\",\"name\":\"GCS_WGS_1984\",\"authCode\":{\"auth\":\"EPSG\",\"code\":\"4326\"},\"type\":\"LBC\"}", "propertyNames": [ "location.[0].latitude", "location.[0].longitude", "location.[1].latitude", "location.[1].longitude" ], "propertyValues": [ "deg" ], "uncertainty": 0 } ] } } }
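The SpatialMapping block names the columns that hold a point's coordinates. The sketch below reads those two columns from a single row. It is a hypothetical illustration, not the Workflow service's logic: the `extract_point` helper, the assumption that a row is keyed by its Excel header text, and the sample coordinates are all made up for the example.

```python
from typing import Any


def extract_point(row: dict[str, Any], spatial_mapping: dict[str, str]) -> tuple[float, float]:
    """Return (longitude, latitude) read from the columns named in SpatialMapping."""
    if spatial_mapping.get("type") != "point":
        # Only the point type is described as supported.
        raise ValueError("only 'point' spatial mappings are handled in this sketch")
    latitude = float(row[spatial_mapping["latitude"]])
    longitude = float(row[spatial_mapping["longitude"]])
    return longitude, latitude


# SpatialMapping copied from the example above; the row values are illustrative.
mapping = {
    "type": "point",
    "latitude": "location.[0].latitude",
    "longitude": "location.[0].longitude",
}
row = {"location.[0].latitude": "29.76", "location.[0].longitude": "-95.37"}
print(extract_point(row, mapping))
```

A row that lacks one of the mapped columns raises `KeyError` here, which corresponds to the prerequisite that the Excel file contains the spatial data attributes.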
ACL
Specifies the Access Control list for each Excel sheet. If this information is not provided, the ACL provided in the metadata of the Excel file is used.
Example:
"acl": { "viewers": [ "data.default.viewers@{domain}.com" ], "owners": [ "data.default.owners@{domain}.com" ] }
Legal tags
- Specifies the legal tags for each Excel sheet. This holds the details of the data compliance for the data ingested. If this information is not provided, the legal tags provided in the metadata of the Excel file are used.
- API Reference
- Example:
"legal": { "legaltags": [ "opendes-public-usa-dataset-7643990" ], "otherRelevantDataCountries": [ "US" ], "status": "compliant" }
The ingestion framework comes with pre-registered DAGs for Excel ingestion, along with their corresponding workflows, which are available to use:
Excel DAG:
- Workflow name: excel_ingestor_wf
OSDU Excel ingestor known limitations:
- Currently, we support Excel files with up to 30 sheets.
- The ingestor is not yet integrated with the OSDU Data Platform Notification service.
- You must manually call the Workflow service to execute a registered DAG.
- Validation of DAGs and Custom Operators is not yet built. You must make sure that the DAGs and Custom Operators that are passed as the payload are correct as per Airflow constructs.
- Scalability limitations: 90 queued up workflow requests, with 10 concurrent workflows executing.
- Currently, only the point type location values Latitude/Y and Longitude/X for spatial data are supported.
- Only one key is used to search for the parent record in relationships that require search. Multiple keys are not currently supported.
- Nested arrays are not supported for ID generation and relationships.
- Arrays without index attributes are not supported by any of the Excel handlers.
- Metablocks should not be created when the attributes used in the propertyvalues are not present in the record.
- Spatial location blocks are only partially created if the attributes used to create the spatial location are missing in the schema. ]()