#
Component Reference
#
How to use this reference
In the following reference, we provide a brief description of each Component, along with a table detailing each of its parameters and an example.
Where scoping operations occur or entries are added to the arguments map, we detail these in the Scope column of each parameter table.
A scope entry of Default means the same scope is used as the spec that invokes it.
We provide examples in the following form:
{
"spec": ...,
"input": ...,
"output": ...
}
where spec is the spec of the Protean transform and input/output are the example input/output JSON objects.
We may also include properties for the properties map where applicable.
#
List of Components
#
as_list
Resolves to a list representation of given values, suppressing automatic unboxing of singleton lists.
#
Parameters
#
Example
{
"spec": {
"list_1": {
"#type": "as_list",
"values": "$.input_list"
},
"list_2": {
"#type": "as_list",
"values": "$.input_value"
}
},
"input": {
"input_list": [
"Only one element"
],
"input_value": "A value"
},
"output": {
"list_1": [
"Only one element"
],
"list_2": [
"A value"
]
}
}
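The wrapping behaviour shown in this example can be pictured with a minimal Python sketch. It is an illustration only, not the library's implementation; the function name as_list simply mirrors the component name.

```python
def as_list(value):
    """Return `value` unchanged if it is already a list, else wrap it in one."""
    return value if isinstance(value, list) else [value]


print(as_list(["Only one element"]))  # a singleton list stays a list
print(as_list("A value"))             # a scalar is wrapped in a list
```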
#
as_string
Resolves to the given value interpreted as or converted to a string, as applicable.
The output of this component depends on the configured value of Config::toStringMethod, the default being java.lang.Object::toString.
#
Parameters
#
Example
{
"spec": {
"strings": {
"#type": "for_each",
"values": "$.values_from_input",
"spec": {
"#type": "as_string",
"value": "$"
}
}
},
"input": {
"values_from_input": [
"A string",
3,
true,
{
"key": "value"
},
[
"A sub-list",
7,
false
]
]
},
"output": {
"strings": [
"A string",
"3",
"true",
"{key=value}",
"[\"A sub-list\",7,false]"
]
}
}
#
as_value
Resolves to a single-value representation of a given value using the configured default accumulator (for Glyphs and the transform endpoint, the accumulator Accumulators::first is used).
#
Parameters
#
Example
{
"spec": {
"value_1": {
"#type": "as_value",
"value": "$.input_list"
},
"value_2": {
"#type": "as_value",
"value": "$.input_value"
}
},
"input": {
"input_list": [
"First element",
"Second element"
],
"input_value": "A value"
},
"output": {
"value_1": "First element",
"value_2": "A value"
}
}
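As a rough illustration of the example above, the following Python sketch reduces a list to a single value with a "first" accumulator. The helper first is a hypothetical stand-in for Accumulators::first, and returning None for an empty list is an assumption, not documented behaviour.

```python
def first(values):
    """Hypothetical stand-in for Accumulators::first: keep the first element."""
    return values[0] if values else None  # None for an empty list is assumed


def as_value(value, accumulator=first):
    """Collapse a list with the accumulator; pass non-list values through."""
    return accumulator(value) if isinstance(value, list) else value


print(as_value(["First element", "Second element"]))
print(as_value("A value"))
```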
#
cache
Resolves to the result of a caching operation against the provided key, possibly using a provided value loader for write operations.
The cache operation can be one of the following: get, get if present, get or load, put, and evict. More details for each operation can be found in the CacheOperation table.
#
Parameters
#
CacheOperation
#
Example
{
"spec": {
"#type": "chain",
"chain": [
{
"put": {
"#type": "cache",
"operation": "put",
"key": "cache_key",
"value": "$.string"
}
},
{
"#type": "merge",
"values": [
"$",
{
"get": {
"#type": "cache",
"operation": "get",
"key": "cache_key"
}
}
]
}
]
},
"input": {
"string": "Hello, world!"
},
"output": {
"put": "Hello, world!",
"get": "Hello, world!"
}
}
#
chain
Resolves to the result of applying a sequential chain of specs where the output of one is passed into the input of the next.
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "chain",
"chain": [
"$.string_from_input",
{
"#type": "string_to_case",
"value": "$",
"case": "lower"
},
{
"#type": "string_split",
"value": "$",
"delimiter": "[\\W_]"
},
{
"#type": "sort",
"values": "$[?(@ != '')]"
},
{
"#type": "for_each",
"values": "$",
"spec": {
"#type": "string_join",
"values": [
"{ ",
"{&.indices[-2]}",
": '",
"$",
"' }"
]
}
},
{
"#type": "string_join",
"values": "$",
"delimiter": "; "
},
"My Result: {$}"
]
}
},
"input": {
"string_from_input": "Hello, world! Is this a string? Maybe"
},
"output": {
"result": "My Result: { 0: 'a' }; { 1: 'hello' }; { 2: 'is' }; { 3: 'maybe' }; { 4: 'string' }; { 5: 'this' }; { 6: 'world' }"
}
}
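To make the data flow of this example easier to follow, here is a minimal Python sketch of the same pipeline. It only mimics the idea of passing each step's output to the next; the helper run_chain and the lambdas are illustrative stand-ins for the components, not the library's API.

```python
import re
from functools import reduce


def run_chain(value, steps):
    """Feed `value` through `steps`, passing each output to the next step."""
    return reduce(lambda acc, step: step(acc), steps, value)


steps = [
    str.lower,                                          # string_to_case
    lambda s: re.split(r"[\W_]", s),                    # string_split
    lambda parts: sorted(p for p in parts if p != ""),  # filter + sort
    lambda words: ["{ " + str(i) + ": '" + w + "' }"    # for_each with index
                   for i, w in enumerate(words)],
    "; ".join,                                          # string_join
    lambda s: "My Result: " + s,                        # final template
]

result = run_chain("Hello, world! Is this a string? Maybe", steps)
print(result)
```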
#
csv_to_json
Resolves to a list of maps that is a JSON representation of the given CSV file.
The keys of each map are the headers of the CSV file and the values are populated by the corresponding data in each row.
If no headers are present in the CSV file and none are provided in the headers parameter of this component spec, then numeric headers are used, equal to the zero-based ordinal of each column converted to a string.
The output of this component and the interpretation of the value parameter depend on the implementation of CsvProvider assigned to the config field Config::csvProvider.
By default, an implementation based on Jackson's CSV data format deserializer is used, where value is taken to be a path to the CSV file.
#
Parameters
#
Example
For the following example, let example_file.csv be a CSV file with the following contents:
example_file.csv
id,name,description
1,Dog,An animal that goes woof
2,Cat,An animal that goes meow
3,Fox,An animal that goes ???
Example:
{
"spec": {
"animals": {
"#type": "csv_to_json",
"value": "$.input_file"
}
},
"input": {
"input_file": "example_file.csv"
},
"output": {
"animals": [
{
"id": "1",
"name": "Dog",
"description": "An animal that goes woof"
},
{
"id": "2",
"name": "Cat",
"description": "An animal that goes meow"
},
{
"id": "3",
"name": "Fox",
"description": "An animal that goes ???"
}
]
}
}
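The row-to-map conversion, including the numeric fallback headers, can be sketched in Python as follows. This is not the Jackson-based provider: it takes CSV text rather than a file path, and the has_header_row flag is a hypothetical parameter introduced only for this illustration.

```python
import csv
import io


def csv_to_json(text, headers=None, has_header_row=True):
    """Convert CSV text to a list of dicts keyed by header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    if has_header_row:
        file_headers, rows = rows[0], rows[1:]
        headers = headers or file_headers
    elif headers is None:
        # No headers anywhere: use the zero-based column ordinal as a string.
        headers = [str(i) for i in range(len(rows[0]))]
    return [dict(zip(headers, row)) for row in rows]


animals = csv_to_json(
    "id,name,description\n"
    "1,Dog,An animal that goes woof\n"
    "2,Cat,An animal that goes meow\n"
)
print(animals)
print(csv_to_json("1,Dog\n2,Cat\n", has_header_row=False))
```

Note that every value stays a string, which matches the "id": "1" entries in the example output.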
#
date_format
Resolves to a formatted string date for the given date object and format string.
This component will typically be used in conjunction with a date_parse, date_now or date_from_millis component.
#
Parameters
#
Example
{
"spec": {
"date": {
"#type": "date_format",
"value": {
"#type": "date_parse",
"value": "$.input_date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"format": "d/M/y"
}
},
"input": {
"input_date": "1992-05-07 16:32:08"
},
"output": {
"date": "7/5/1992"
}
}
#
date_from_millis
Resolves to a java.util.Date object that represents a given timestamp in epoch millis.
This component will typically be used in conjunction with a date_format or date_to_millis component.
Otherwise, the result when serialized to JSON will depend on the ObjectMapper used to render the output.
#
Parameters
#
Example
{
"spec": {
"date": {
"#type": "date_format",
"value": {
"#type": "date_from_millis",
"value": "$.input_date"
},
"format": "d/M/y"
}
},
"input": {
"input_date": 705256328000
},
"output": {
"date": "7/5/1992"
}
}
#
date_now
Resolves to a java.util.Date object that represents the current date/time.
This component will typically be used in conjunction with a date_format or date_to_millis component.
Otherwise, the result when serialized to JSON will depend on the ObjectMapper used to render the output.
#
Parameters
This component has no parameters.
#
Example
{
"spec": {
"date": {
"#type": "date_format",
"value": {
"#type": "date_now"
},
"format": "d/M/y"
}
},
"output": {
"date": "16/12/2024"
}
}
#
date_parse
Resolves to a java.util.Date object that represents a given date string value in a given format.
This component will typically be used in conjunction with a date_format or date_to_millis component.
Otherwise, the result when serialized to JSON will depend on the ObjectMapper used to render the output.
#
Parameters
#
Example
{
"spec": {
"date": {
"#type": "date_parse",
"value": "$.input_date",
"format": "d!M!y"
}
},
"input": {
"input_date": "7!5!92"
},
"output": {
"date": "1992-05-07T00:00:00.000+00:00"
}
}
#
date_to_millis
Resolves to a long equal to the standard epoch timestamp in millis of a given date.
This component will typically be used in conjunction with a date_parse, date_now or date_from_millis component.
#
Parameters
#
Example
{
"spec": {
"date": {
"#type": "date_to_millis",
"value": {
"#type": "date_parse",
"value": "$.input_date",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
},
"input": {
"input_date": "1992-05-07 16:32:08"
},
"output": {
"date": 705256328000
}
}
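The relationship between the date components can be checked with a short Python sketch. It assumes the date string is interpreted as UTC, which is consistent with the values in the examples (1992-05-07 16:32:08 UTC corresponds to 705256328000 ms). Python uses strptime directives rather than the Java-style patterns accepted by the components, and the function names merely mirror the component names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_parse(text, fmt):
    """Parse `text` with a strptime format, assuming UTC."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


def date_to_millis(date):
    """Whole milliseconds elapsed since the Unix epoch."""
    return (date - EPOCH) // timedelta(milliseconds=1)


def date_from_millis(millis):
    """Inverse of date_to_millis."""
    return EPOCH + timedelta(milliseconds=millis)


date = date_parse("1992-05-07 16:32:08", "%Y-%m-%d %H:%M:%S")
millis = date_to_millis(date)
print(millis)                                  # 705256328000
print(date_from_millis(millis) == date)        # round trip holds
print(f"{date.day}/{date.month}/{date.year}")  # same shape as "d/M/y"
```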
#
declare
Resolves to a given value after having added the provided entries to the arguments map.
These can then be referenced using the & source specifier (see Atomic Paths).
#
Parameters
#
Example
{
"spec": {
"#type": "declare",
"args": {
"define_arg": "$.value_from_input",
"another_arg": 3
},
"value": {
"dereference_args": "{&.define_arg} My favourite number is {&.another_arg}"
}
},
"input": {
"value_from_input": "Hello, world!"
},
"output": {
"dereference_args": "Hello, world! My favourite number is 3"
}
}
#
decrypt
Resolves to the plaintext string that is the result of decrypting the given cyphertext value using the provided encryption key.
The encryption/decryption protocol depends on the implementation of EncryptionProvider that is set in the Config field encryptionProvider.
The default implementation is DisallowedEncryptionProvider, which throws a DisallowedOperationException if an encrypt/decrypt component is used, so an alternative implementation must be provided in order to enable the encryption feature.
An implementation, JavaAESEncryptionProvider, is included in the proteus-core library, which is based on Java AES and uses cipher block chaining (CBC) mode with a random initialization vector.
In this encryption provider, the key parameter is a plaintext password used along with a configurable salt value to generate the full encryption key.
#
Parameters
#
Example
{
"spec": {
"plaintext": {
"#type": "decrypt",
"value": "JtmdXxTsEg4ZLE9bWQ9JqhQhAZ1fsuC8xH//2l7Iusk=",
"key": "mykey123"
}
},
"output": {
"plaintext": "Hello, world!"
}
}
#
distinct
Resolves to a list composed of the distinct values of a given input list.
#
Parameters
#
Example
{
"spec": {
"strings": {
"#type": "distinct",
"values": "$.strings_with_repetitions"
},
"integers": {
"#type": "distinct",
"values": "$.integers_with_repetitions"
}
},
"input": {
"strings_with_repetitions": [
"dog",
"cat",
"dog",
"mouse"
],
"integers_with_repetitions": [
1,
2,
3,
2,
3
]
},
"output": {
"strings": [
"dog",
"cat",
"mouse"
],
"integers": [
1,
2,
3
]
}
}
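A minimal Python sketch of order-preserving de-duplication, matching the first-occurrence order seen in the example output. It compares by equality so that unhashable values such as nested lists or maps also work; how the library itself compares elements is not specified here.

```python
def distinct(values):
    """Keep the first occurrence of each value, preserving input order."""
    seen = []
    for value in values:
        if value not in seen:  # equality check, works for dicts and lists too
            seen.append(value)
    return seen


print(distinct(["dog", "cat", "dog", "mouse"]))
print(distinct([1, 2, 3, 2, 3]))
```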
#
encrypt
Resolves to the cyphertext string that is the result of encrypting the given value using the provided encryption key.
The encryption protocol depends on the EncryptionProvider set in the configuration property Config::encryptionProvider
(see
#
Parameters
#
Example
{
"spec": {
"cyphertext": {
"#type": "encrypt",
"value": "Hello, world!",
"key": "mykey123"
}
},
"output": {
"cyphertext": "GpjihpRAO7Yw129rsQpf7Sg0Ag2JKQL7M1KZCS/fLqA="
}
}
#
entries
Resolves to a list of map entries for a given map where the key/value label for each entry can be customized.
#
Parameters
#
Example
{
"spec": {
"entries": {
"#type": "entries",
"map": "$.map",
"key": "id",
"value": "name"
}
},
"input": {
"map": {
"key_1": "value_1",
"key_2": "value_2",
"key_3": "value_3"
}
},
"output": {
"entries": [
{
"id": "key_1",
"name": "value_1"
},
{
"id": "key_2",
"name": "value_2"
},
{
"id": "key_3",
"name": "value_3"
}
]
}
}
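The reshaping performed in this example amounts to the following Python sketch. The default labels "key" and "value" are assumptions made for the illustration; only the custom labels "id" and "name" come from the example.

```python
def entries(mapping, key="key", value="value"):
    """Turn a map into a list of single-entry records with custom labels."""
    return [{key: k, value: v} for k, v in mapping.items()]


result = entries(
    {"key_1": "value_1", "key_2": "value_2", "key_3": "value_3"},
    key="id",
    value="name",
)
print(result)
```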
#
fallback
Resolves to the output of the first component in a given list of fallback strategy components that neither outputs null nor throws an exception.
#
Parameters
#
Example
{
"spec": {
"fallback": {
"#type": "fallback",
"strategies": [
"$.non_existent",
"$.existent"
]
}
},
"input": {
"existent": "Hello, world!"
},
"output": {
"fallback": "Hello, world!"
}
}
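The selection rule can be sketched in Python as below, with each strategy modelled as a plain function of the input. Returning None when every strategy fails is an assumption of this sketch; the component's behaviour in that case is not described here.

```python
def fallback(strategies, data):
    """Return the first strategy result that is not None and did not raise."""
    for strategy in strategies:
        try:
            result = strategy(data)
        except Exception:
            continue  # a failing strategy is skipped
        if result is not None:
            return result
    return None  # assumed outcome when no strategy succeeds


strategies = [
    lambda d: d["non_existent"],  # raises KeyError
    lambda d: d.get("existent"),
]
print(fallback(strategies, {"existent": "Hello, world!"}))
```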
#
file_delete
Resolves to true after having deleted the file at the given path, false if the file does not exist, or null if the provided path is null or an error occurs while deleting and the deletion is not required.
The config property enableFileDelete must be set to true to enable this component.
#
Parameters
#
Example
For the following example, let there be a file on the local file system with path /data/exists.txt and no file with path /data/not_exists.txt.
{
"spec": {
"results": {
"#type": "for_each",
"values": "$.files",
"spec": {
"path": "$",
"deleted": {
"#type": "file_delete",
"path": "$"
}
}
}
},
"input": {
"files": [
"/data/exists.txt",
"/data/not_exists.txt"
]
},
"output": {
"results": [
{
"path": "/data/exists.txt",
"deleted": true
},
{
"path": "/data/not_exists.txt",
"deleted": false
}
]
}
}
#
file_details
Resolves to a data structure that represents details of the file tree at the given path.
Each node in the file tree is represented by a JSON object with the following schema:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"file",
"directory"
],
"description": "The type of file node. I.e. whether it is a file or a directory."
},
"name": {
"type": "string",
"description": "The name of the file without any parent directories. I.e. for '/a/b/c/file.txt', the name would be 'file.txt'."
},
"base_name": {
"type": "string",
"description": "The name of the file excluding the file extension. I.e. for '/a/b/c/file.txt', the base name would be 'file'."
},
"extension": {
"type": "string",
"description": "The file extension. I.e. for '/a/b/c/file.txt', the extension would be 'txt'."
},
"path": {
"type": "string",
"description": "The path of this file relative to the base file given by the 'path' parameter."
},
"absolute_path": {
"type": "string",
"description": "The absolute path of the file"
},
"last_modified": {
"type": "integer",
"description": "The date that the file was last modified in Unix epoch millis"
},
"size": {
"type": "integer",
"description": "The size of the file in bytes"
},
"files": {
"type": "array",
"description": "The files contained within this directory",
"items": {
"$ref": "#"
}
}
}
}
The config property enableFileRead must be set to true to enable this component.
#
Parameters
#
Example
For the following example, let the following file tree exist on the root file system, where the file contents are displayed as leaves of the tree:
data/:
  file_1.txt: "Hello, world!"
  dir_1/:
    file_2.txt: "I am a file!"
Example:
{
"spec": {
"as_tree": {
"#type": "file_details",
"path": "$.file"
},
"flat": {
"#type": "file_details",
"path": "$.file",
"flat": true
},
"custom": {
"#type": "file_details",
"path": "$.file",
"yield_file": {
"$.name": {
"#type": "switch",
"value": "$.type",
"cases": {
"directory": {
"#type": "merge",
"values": "$.files"
},
"file": {
"#type": "file_read",
"path": "$.absolute_path"
}
}
}
}
}
},
"input": {
"file": "/data"
},
"output": {
"as_tree": {
"type": "directory",
"name": "data",
"base_name": "data",
"absolute_path": "/data",
"last_modified": 1734368271882,
"size": 4096,
"files": [
{
"type": "directory",
"name": "dir_1",
"base_name": "dir_1",
"path": "dir_1",
"absolute_path": "/data/dir_1",
"last_modified": 1734368300385,
"size": 4096,
"files": [
{
"type": "file",
"name": "file_2.txt",
"base_name": "file_2",
"extension": "txt",
"path": "dir_1/file_2.txt",
"absolute_path": "/data/dir_1/file_2.txt",
"last_modified": 1734369225225,
"size": 12,
"files": []
},
{
"type": "directory",
"name": "dir_2",
"base_name": "dir_2",
"path": "dir_1/dir_2",
"absolute_path": "/data/dir_1/dir_2",
"last_modified": 1734368300385,
"size": 4096,
"files": []
}
]
},
{
"type": "file",
"name": "file_1.txt",
"base_name": "file_1",
"extension": "txt",
"path": "file_1.txt",
"absolute_path": "/data/file_1.txt",
"last_modified": 1734369165094,
"size": 13,
"files": []
}
]
},
"flat": [
{
"type": "directory",
"name": "data",
"base_name": "data",
"absolute_path": "/data",
"last_modified": 1734368271882,
"size": 4096,
"files": []
},
{
"type": "directory",
"name": "dir_1",
"base_name": "dir_1",
"path": "dir_1",
"absolute_path": "/data/dir_1",
"last_modified": 1734368300385,
"size": 4096,
"files": []
},
{
"type": "file",
"name": "file_2.txt",
"base_name": "file_2",
"extension": "txt",
"path": "dir_1/file_2.txt",
"absolute_path": "/data/dir_1/file_2.txt",
"last_modified": 1734369225225,
"size": 12,
"files": []
},
{
"type": "directory",
"name": "dir_2",
"base_name": "dir_2",
"path": "dir_1/dir_2",
"absolute_path": "/data/dir_1/dir_2",
"last_modified": 1734368300385,
"size": 4096,
"files": []
},
{
"type": "file",
"name": "file_1.txt",
"base_name": "file_1",
"extension": "txt",
"path": "file_1.txt",
"absolute_path": "/data/file_1.txt",
"last_modified": 1734369165094,
"size": 13,
"files": []
}
],
"custom": {
"data": {
"dir_1": {
"file_2.txt": "I am a file!"
},
"file_1.txt": "Hello, world!"
}
}
}
}
#
file_read
Resolves to the full string content of the file at the specified path.
The config property enableFileRead must be set to true to enable this component.
The output of this component depends on which implementation of the SPI FileContentReader is specified in Config.
By default, DefaultFileContentReader is used, which reads the file using java.nio.file.Files::readString.
#
Parameters
#
Example
For the following example, let example_file.txt be a file whose content is the following:
example_file.txt
Hello, world!
Example:
{
"spec": {
"my_file_contents": {
"#type": "file_read",
"path": "$.my_file_path",
"require_read": true
}
},
"input": {
"my_file_path": "example_file.txt"
},
"output": {
"my_file_contents": "Hello, world!"
}
}
#
file_write
Resolves to a given value after having written the specified string content to a file.
This component resolves to null if path or content is or resolves to null, or if an error occurs while writing the content and require_write is false.
The config property enableFileWrite must be set to true to enable this component.
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "chain",
"chain": [
{
"#type": "file_write",
"path": "{$.base_directory}/{$.file_name}",
"content": "$.text",
"value": "Successful"
},
{
"write": "$",
"read": {
"#type": "file_read",
"path": "{$$.base_directory}/{$$.file_name}"
}
}
]
}
},
"input": {
"base_directory": "/data",
"file_name": "output.txt",
"text": "Hello, world!"
},
"output": {
"result": {
"write": "Successful",
"read": "Hello, world!"
}
}
}
#
flatten
Resolves to a flattened list of a given (potentially nested) list.
If the values parameter spec is not a list, this component will resolve to the unchanged output of values.
Otherwise, it will recursively unpack all nested lists of values in situ, such that no element of the output list will be a list.
#
Parameters
#
Example
{
"spec": {
"flattened_list": {
"#type": "flatten",
"values": [
"a",
[
"b",
"c"
],
[
[
"d",
"e"
],
"$.list_from_input"
]
]
}
},
"input": {
"list_from_input": [
"f",
[
"g"
]
]
},
"output": {
"flattened_list": [
"a",
"b",
"c",
"d",
"e",
"f",
"g"
]
}
}
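The recursive unpacking described for this component can be sketched in a few lines of Python. This is an illustration of the behaviour, not the library's code; the test input is the example's values list with the input path already resolved.

```python
def flatten(values):
    """Recursively unpack nested lists; pass non-list values through."""
    if not isinstance(values, list):
        return values
    flat = []
    for item in values:
        if isinstance(item, list):
            flat.extend(flatten(item))  # splice nested elements in place
        else:
            flat.append(item)
    return flat


print(flatten(["a", ["b", "c"], [["d", "e"], ["f", ["g"]]]]))
```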
#
for_each
Resolves to a list formed by iterating over a given list and replacing each element with the output of a given component spec applied to that element.
#
Parameters
#
Example
{
"spec": {
"output_list": {
"#type": "for_each",
"values": "$.input_list",
"spec": {
"template": "<span class=\"{$.type}\">Name: {$.name}</span>",
"id": "$.id",
"parent": "$$.id"
}
}
},
"input": {
"id": "record-123",
"input_list": [
{
"id": "agent-123",
"type": "person",
"name": "Bob"
},
{
"id": "agent-456",
"type": "cat",
"name": "Felix"
},
{
"id": "object-123",
"type": "artifact",
"name": "The Rosetta Stone"
}
]
},
"output": {
"output_list": [
{
"template": "<span class=\"person\">Name: Bob</span>",
"id": "agent-123",
"parent": "record-123"
},
{
"template": "<span class=\"cat\">Name: Felix</span>",
"id": "agent-456",
"parent": "record-123"
},
{
"template": "<span class=\"artifact\">Name: The Rosetta Stone</span>",
"id": "object-123",
"parent": "record-123"
}
]
}
}
#
generate_id
Resolves to a newly generated id string.
The method of generating ids depends on the IdGenerator that is set in the Config field idGenerator.
The default setting disallows id generation, so an alternative IdGenerator must be specified to enable the generate_id component.
#
Parameters
This component has no parameters.
#
Example
{
"spec": {
"id": {
"#type": "generate_id"
}
},
"output": {
"id": "bf4ef807-d12d-36e8-8f01-002a331c6239"
}
}
#
group
Resolves to a map that is the result of grouping a given list by a given key provider or by a default key if none is found in the element, optionally applying a spec to each group element and to each group.
Note that null keys are disallowed in the output of this component, so if no key is found for an element and the default key is null, then the element will be dropped from the result.
#
Parameters
#
Example
{
"spec": {
"grouped_list": {
"#type": "group",
"values": "$.list",
"by": "$.type",
"default_key": "(no type)",
"yield_element": {
"id": "$.id",
"colour": "$.colour"
},
"yield_group": {
"#type": "group",
"values": "$",
"by": "$.colour",
"default_key": "(no colour)",
"yield_element": "$.id"
}
}
},
"input": {
"list": [
{
"id": "item-1",
"type": "car",
"colour": "blue"
},
{
"id": "item-2",
"type": "car",
"colour": "green"
},
{
"id": "item-3",
"type": "bike",
"colour": "blue"
},
{
"id": "item-4",
"type": "bike",
"colour": "red"
},
{
"id": "item-5",
"type": "train"
},
{
"id": "item-6",
"colour": "green"
}
]
},
"output": {
"grouped_list": {
"(no type)": {
"green": [
"item-6"
]
},
"bike": {
"blue": [
"item-3"
],
"red": [
"item-4"
]
},
"car": {
"blue": [
"item-1"
],
"green": [
"item-2"
]
},
"train": {
"(no colour)": [
"item-5"
]
}
}
}
}
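The grouping rules, including the default key and the dropping of elements whose key is null, can be sketched in Python as follows. The by, yield_element and yield_group arguments are modelled as plain functions; this is an illustration of the described behaviour under those assumptions, and it makes no claim about the key order of the resulting map.

```python
def group(values, by, default_key=None,
          yield_element=lambda e: e, yield_group=lambda g: g):
    """Group `values` by `by(element)`, falling back to `default_key`."""
    groups = {}
    for element in values:
        key = by(element)
        if key is None:
            key = default_key
        if key is None:
            continue  # null keys are disallowed, so the element is dropped
        groups.setdefault(key, []).append(yield_element(element))
    return {key: yield_group(members) for key, members in groups.items()}


items = [
    {"id": "item-1", "type": "car", "colour": "blue"},
    {"id": "item-2", "type": "car", "colour": "green"},
    {"id": "item-3", "type": "bike", "colour": "blue"},
    {"id": "item-4", "type": "bike", "colour": "red"},
    {"id": "item-5", "type": "train"},
    {"id": "item-6", "colour": "green"},
]

grouped = group(
    items,
    by=lambda e: e.get("type"),
    default_key="(no type)",
    yield_element=lambda e: {"id": e["id"], "colour": e.get("colour")},
    yield_group=lambda members: group(
        members,
        by=lambda e: e.get("colour"),
        default_key="(no colour)",
        yield_element=lambda e: e["id"],
    ),
)
print(grouped)
```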
#
http_request
Resolves to the response body of an HTTP request made against the given URL using the given request parameters.
The output will depend on the specific implementation of the interface HttpClientFacade assigned to Config::httpClient.
The default implementation is based on the OkHttp client.
#
Parameters
#
HttpMethod
#
Example
{
"spec": {
"response": {
"#type": "http_request",
"method": "POST",
"url": "https://dummyjson.com/products/add",
"headers": {
"Content-Type": "application/json"
},
"body": {
"title": "$.product_title"
},
"timeout": 4000,
"allow": 200,
"require_allowed": true,
"require_response": true
}
},
"input": {
"product_title": "pencil"
},
"output": {
"response": {
"id": 101,
"title": "pencil"
}
}
}
#
invoke
Resolves to the output of a given JSON object interpreted as a Protean transform, optionally with given arguments added to the arguments map.
This can be used to invoke a spec embedded in an input document, for instance.
#
Parameters
#
Example
{
"spec": {
"invocation_output": {
"#type": "invoke",
"spec": "$.spec_embedded_in_input",
"args": {
"message": "Hello, world!"
}
}
},
"input": {
"list_from_input": [
"value-1",
"value-2",
"value-3"
],
"type": "object",
"spec_embedded_in_input": {
"#type": "for_each",
"values": "$.list_from_input",
"spec": {
"value": "$",
"type": "$$.type",
"message": "&.message"
}
}
},
"output": {
"invocation_output": [
{
"value": "value-1",
"type": "object",
"message": "Hello, world!"
},
{
"value": "value-2",
"type": "object",
"message": "Hello, world!"
},
{
"value": "value-3",
"type": "object",
"message": "Hello, world!"
}
]
}
}
#
json_path_details
Resolves to a data structure that provides details on the given JSONPath string.
The output conforms to the following JSON Schema:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"path": {
"type": "string",
"description": "The canonical representation of the JSONPath as a string"
},
"tokens": {
"type": "array",
"description": "The list of tokens that form the JSONPath",
"items": {
"type": "object",
"properties": {
"type": {
"type": "string",
"description": "The type of the path token",
"enum": [
"predicate_path",
"root_path",
"function_path",
"array_slice",
"array_index",
"property_path",
"scan_path",
"wildcard_path"
]
},
"value": {
"type": "string",
"description": "The string value of the path token"
},
"definite": {
"type": "boolean",
"description": "True if the path token is definite, false otherwise"
}
}
}
},
"definite": {
"type": "boolean",
"description": "True if the path is definite, false otherwise"
}
}
}
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "json_path_details",
"path": "$.input_path"
}
},
"input": {
"input_path": "$.objects[*].items[?(@.colour == 'blue')]['children'][0]..names[:3].length()"
},
"output": {
"result": {
"path": "$['objects'][*]['items'][?(@['colour'] == 'blue')]['children'][0]..['names'][:3].length()",
"tokens": [
{
"type": "root_path",
"value": "$",
"definite": true
},
{
"type": "property_path",
"value": "['objects']",
"definite": true
},
{
"type": "wildcard_path",
"value": "[*]",
"definite": false
},
{
"type": "property_path",
"value": "['items']",
"definite": true
},
{
"type": "predicate_path",
"value": "[?(@['colour'] == 'blue')]",
"definite": false
},
{
"type": "property_path",
"value": "['children']",
"definite": true
},
{
"type": "array_index",
"value": "[0]",
"definite": true
},
{
"type": "scan_path",
"value": "..",
"definite": false
},
{
"type": "property_path",
"value": "['names']",
"definite": true
},
{
"type": "array_slice",
"value": "[:3]",
"definite": false
},
{
"type": "function_path",
"value": ".length()",
"definite": true
}
],
"definite": false
}
}
}
#
json_to_string
Resolves to the JSON string representation of a given input value.
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "json_to_string",
"value": "$.input_object"
}
},
"input": {
"input_object": {
"key": "value"
}
},
"output": {
"result": "{\"key\":\"value\"}"
}
}
#
leaves
Resolves to a data structure that represents the leaves of the JSON tree for a given input value.
The output conforms to the following JSON Schema:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "array",
"description": "The list of leaves",
"items": {
"type": "object",
"description": "A leaf",
"properties": {
"value": {
"description": "The value of the leaf",
"oneOf": [
{
"type": "string"
},
{
"type": "number"
},
{
"type": "boolean"
},
{
"type": "null"
}
]
},
"path": {
"type": "object",
"description": "The path where the leaf is located in the input JSON structure",
"properties": {
"value": {
"type": "string",
"description": "The path rendered as a definite JSONPath string"
},
"elements": {
"type": "array",
"description": "The elements that form the path",
"items": {
"description": "An element of the path",
"allOf": [
{
"type": "object",
"properties": {
"type": {
"description": "The type of path element i.e. whether it is an array index or a string key"
},
"value": {
"description": "The value of the element i.e. the index or key"
},
"path_fragment": {
"type": "string",
"description": "The element rendered as a JSONPath fragment"
}
}
},
{
"oneOf": [
{
"properties": {
"type": {
"const": "key"
},
"value": {
"type": "string",
"description": "The string key"
}
}
},
{
"properties": {
"type": {
"const": "index"
},
"value": {
"type": "integer",
"description": "The array index"
}
}
}
]
}
]
}
}
}
}
}
}
}
#
Parameters
#
Example
{
"spec": {
"#type": "leaves",
"value": "$"
},
"input": {
"root": true,
"sort_key": 3,
"items": [
{
"id": "item-1",
"children": [
{
"id": "child-1",
"type": "green"
},
{
"id": "child-2",
"type": "blue"
}
]
}
]
},
"output": [
{
"value": true,
"path": {
"elements": [
{
"type": "key",
"value": "root",
"path_fragment": "['root']"
}
],
"value": "$['root']"
}
},
{
"value": 3,
"path": {
"elements": [
{
"type": "key",
"value": "sort_key",
"path_fragment": "['sort_key']"
}
],
"value": "$['sort_key']"
}
},
{
"value": "item-1",
"path": {
"elements": [
{
"type": "key",
"value": "items",
"path_fragment": "['items']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "id",
"path_fragment": "['id']"
}
],
"value": "$['items'][0]['id']"
}
},
{
"value": "child-1",
"path": {
"elements": [
{
"type": "key",
"value": "items",
"path_fragment": "['items']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "children",
"path_fragment": "['children']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "id",
"path_fragment": "['id']"
}
],
"value": "$['items'][0]['children'][0]['id']"
}
},
{
"value": "green",
"path": {
"elements": [
{
"type": "key",
"value": "items",
"path_fragment": "['items']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "children",
"path_fragment": "['children']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "type",
"path_fragment": "['type']"
}
],
"value": "$['items'][0]['children'][0]['type']"
}
},
{
"value": "child-2",
"path": {
"elements": [
{
"type": "key",
"value": "items",
"path_fragment": "['items']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "children",
"path_fragment": "['children']"
},
{
"type": "index",
"value": 1,
"path_fragment": "[1]"
},
{
"type": "key",
"value": "id",
"path_fragment": "['id']"
}
],
"value": "$['items'][0]['children'][1]['id']"
}
},
{
"value": "blue",
"path": {
"elements": [
{
"type": "key",
"value": "items",
"path_fragment": "['items']"
},
{
"type": "index",
"value": 0,
"path_fragment": "[0]"
},
{
"type": "key",
"value": "children",
"path_fragment": "['children']"
},
{
"type": "index",
"value": 1,
"path_fragment": "[1]"
},
{
"type": "key",
"value": "type",
"path_fragment": "['type']"
}
],
"value": "$['items'][0]['children'][1]['type']"
}
}
]
}
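The traversal that produces this output can be approximated with a short recursive Python sketch. It follows the structure of the schema above but is not the library's implementation; in particular it does not escape quotes inside keys, and it emits nothing for empty objects or arrays, which is an assumption.

```python
def leaves(value, elements=()):
    """Collect every scalar in `value` together with the path leading to it."""
    if isinstance(value, dict):
        children = [("key", k, "['" + k + "']", v) for k, v in value.items()]
    elif isinstance(value, list):
        children = [("index", i, "[" + str(i) + "]", v)
                    for i, v in enumerate(value)]
    else:
        # A scalar (or null) is a leaf: render its accumulated path.
        fragments = "".join(e["path_fragment"] for e in elements)
        return [{
            "value": value,
            "path": {"elements": list(elements), "value": "$" + fragments},
        }]
    found = []
    for kind, key, fragment, child in children:
        element = {"type": kind, "value": key, "path_fragment": fragment}
        found.extend(leaves(child, elements + (element,)))
    return found


document = {
    "root": True,
    "sort_key": 3,
    "items": [{"id": "item-1", "children": [
        {"id": "child-1", "type": "green"},
        {"id": "child-2", "type": "blue"},
    ]}],
}
result = leaves(document)
for leaf in result:
    print(leaf["path"]["value"], "=", leaf["value"])
```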
#
literal
Resolves to the given value literally, suppressing all JSONPath resolution and interpretive logic.
#
Parameters
#
Example
{
"spec": {
"literal_output": {
"#type": "literal",
"value": {
"do_not_resolve": "$.this.path",
"or_these": [
"$.path[0]",
"$.path[1]",
"$.path[2]"
],
"especially": {
"not": "$$.this.one"
}
}
}
},
"input": {
"this": {
"path": "Ignore me"
},
"path": [
"Also ignore me",
"And me",
"Me too"
]
},
"output": {
"do_not_resolve": "$.this.path",
"or_these": [
"$.path[0]",
"$.path[1]",
"$.path[2]"
],
"especially": {
"not": "$$.this.one"
}
}
}
#
maths
Resolves to the result of the given mathematical expression where the numeric type of the output can be optionally specified.
The output of this component depends on which implementation of the interface MathsEvaluator is used, which can be changed through the config field Config::mathsEvaluator.
By default, an implementation based on exp4j is used.
#
Parameters
#
NumericType
#
Example
{
"spec": {
"result": {
"#type": "maths",
"expression": "3*({$.a} - 1)/({$.b} + 1)",
"output_type": "integer",
"require_evaluate": true
}
},
"input": {
"a": 11,
"b": 7.5
},
"output": {
"result": 3
}
}
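The arithmetic behind this example can be checked by hand: with a = 11 and b = 7.5 the expression is 3 × 10 / 8.5 ≈ 3.53, and the documented integer result of 3 is what discarding the fractional part gives. The Python lines below reproduce that calculation; that the conversion truncates rather than rounds is inferred from the example, and the actual behaviour is determined by the configured MathsEvaluator.

```python
a, b = 11, 7.5
raw = 3 * (a - 1) / (b + 1)  # 30 / 8.5, roughly 3.53
result = int(raw)            # int() discards the fractional part

print(raw, result)
```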
#
merge
Resolves to a value that is the result of merging a list of values.
Each element of values is merged into the output sequentially and recursively up to the given maximum depth (depth 1 by default) or without depth restriction by setting depth equal to -1.
At each level of recursion, unless overridden by the type collision matrix (see below), pairs of objects are merged by combining the entries of each object, arrays are concatenated together and the last scalar/leaf node is taken to be the output value.
If a collision occurs between the types of the JSON nodes at a given path in any of the input values, then the collision resolution is determined by the type collision matrix, collisions, where the type of each node is considered to be one of the following: scalar for single-valued primitives (e.g. string, int, bool), array for arrays/lists or object for objects/maps.
The possible type collision resolutions are:
first/last to select the value from the first/last element in each pair being merged;
as_array to combine each value as if it were an array (i.e. wrap scalars/objects as a singleton array and concatenate all values together);
or to_null, which maps any colliding values to null.
The type collision matrix is assumed to be symmetric by default, so there is no need to specify the resolution for both scalar-array and array-scalar collisions as long as one is specified.
However, an asymmetric matrix can be specified by setting symmetric to false within the definition of collisions.
In the scalar-array case above, this means the resolution can differ depending on whether the scalar or the array involved in the collision occurs earlier or later in the values input list.
#
Parameters
#
Default Type Collision Matrix
{
"symmetric": true,
"scalar": {
"array": "as_array",
"object": "last"
},
"object": {
"array": "as_array"
}
}
#
Examples
Simple merge
{
"spec": {
"merge_result": {
"#type": "merge",
"values": [
"$.map_from_input",
{
"key_2": "replacement_value",
"key_4": "additional_value"
}
]
}
},
"input": {
"map_from_input": {
"key_1": "value_1",
"key_2": "value_2",
"key_3": "value_3"
}
},
"output": {
"merge_result": {
"key_1": "value_1",
"key_2": "replacement_value",
"key_3": "value_3",
"key_4": "additional_value"
}
}
}
Recursive merge with type collision matrix
{
"spec": {
"merge_result": {
"#type": "merge",
"depth": -1,
"collisions": {
"scalar": {
"object": "as_array"
}
},
"values": [
{
"root": {
"level": 1,
"child": {
"level": 2,
"type": "first",
"values": "scalar",
"name": "John Smith"
}
}
},
{
"root": {
"child": {
"type": "last",
"values": "$.additional_child_values",
"name": {
"first": "John",
"last": "Smith"
}
}
}
}
]
}
},
"input": {
"additional_child_values": [
"array element 1",
"array element 2",
"array element 3"
]
},
"output": {
"merge_result": {
"root": {
"level": 1,
"child": {
"level": 2,
"type": "last",
"values": [
"scalar",
"array element 1",
"array element 2",
"array element 3"
],
"name": [
"John Smith",
{
"first": "John",
"last": "Smith"
}
]
}
}
}
}
}
#
regex_groups
Resolves to the result of evaluating a regular expression against an input string and transforming each match into a map of capture groups.
By default, the keys for each map entry will either be the name or number of the capture group, but this can be overridden using the groups parameter.
If the provided pattern only matches once, the output will be a single map. Otherwise, a list of maps is produced.
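To make the key-naming rules concrete, here is a rough Python sketch of the behaviour described above. It is illustrative only: the function name and signature are hypothetical, Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and returning an empty list when nothing matches is an assumption.

```python
import re


def regex_groups(value, pattern, groups=None):
    """Map every match of `pattern` in `value` to a dict of capture groups."""
    compiled = re.compile(pattern)
    # Reverse lookup so that named groups are keyed by name, the rest by number.
    index_to_name = {index: name for name, index in compiled.groupindex.items()}
    results = []
    for match in compiled.finditer(value):
        if groups is not None:
            # `groups` maps an output key to a group name or group number.
            results.append({key: match.group(ref) for key, ref in groups.items()})
        else:
            results.append({
                index_to_name.get(index, str(index)): match.group(index)
                for index in range(1, compiled.groups + 1)
            })
    # A single match is unwrapped; anything else stays a list.
    return results[0] if len(results) == 1 else results


date = regex_groups(
    "1st Jan 1990",
    r"(\d{1,2}(?:st|nd|rd|th))\s+(?P<middle>\w{3})\s+(\d{4})",
    groups={"day": 1, "month": "middle", "year": 3},
)
```

Here `date` holds the `day`, `month` and `year` entries taken from groups 1, `middle` and 3 respectively.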
#
Parameters
The pattern parameter will often have to be wrapped in a literal component due to overlap between the special characters of regular expressions and JSONPath/Proteus.
#
Example
{
"spec": {
"simple_example": {
"#type": "regex_groups",
"value": "$.person",
"pattern": {
"#type": "literal",
"value": "^(?<forename>\\w+)\\s+(?<surname>\\w+)\\s+\\((?<birthYear>\\d{4})-(?<deathYear>\\d{4})\\)$"
}
},
"example_with_groups_param": {
"#type": "regex_groups",
"value": "$.date",
"pattern": {
"#type": "literal",
"value": "(\\d{1,2}(?:st|nd|rd|th))\\s+(?<middle>\\w{3})\\s+(\\d{4})"
},
"groups": {
"day": 1,
"month": "middle",
"year": 3
}
}
},
"input": {
"person": "John Smith (1950-2000)",
"date": "1st Jan 1990"
},
"output": {
"simple_example": {
"forename": "John",
"surname": "Smith",
"birthYear": "1950",
"deathYear": "2000"
},
"example_with_groups_param": {
"day": "1st",
"month": "Jan",
"year": "1990"
}
}
}
#
regex_replace
Resolves to the result of replacing all regular expression matches found within the given input string with the given replacement pattern.
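As an illustration of the replacement semantics, the following Python sketch reproduces the nested replacement used in the example for this component. It assumes `$1`-style group references in the replacement pattern, as in that example, and translates them to Python's `\g<1>` form. The function name is hypothetical and other replacement syntax is not handled.

```python
import re


def regex_replace(value, pattern, replacement):
    """Replace every match of `pattern` in `value` with `replacement`."""
    # Translate "$1"-style references into Python's "\g<1>" syntax.
    python_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, python_replacement, value)


inner = regex_replace(
    "Cat:A_small_mammal_that_goes_meow",
    r"^(.*?):(.*)$",
    "Name is '$1' - Description is '$2'",
)
result = regex_replace(inner, "_", " ")
```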
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "regex_replace",
"value": {
"#type": "regex_replace",
"value": "$.value_from_input",
"pattern": "^(.*?):(.*)$",
"replacement": "Name is '$1' - Description is '$2'"
},
"pattern": "_",
"replacement": " "
}
},
"input": {
"value_from_input": "Cat:A_small_mammal_that_goes_meow"
},
"output": {
"result": "Name is 'Cat' - Description is 'A small mammal that goes meow'"
}
}
#
require_catch
Resolves to the given value unless the component from which the value is derived throws a RequirementNotMetException, in which case the exception is caught and the component resolves to the output of or_else.
If or_else is used, the argument required is added to the arguments map, which provides information on the requirement that was not met.
The specific value this argument takes depends on the component that throws the exception.
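The control flow resembles an ordinary try/except block. The Python sketch below is a loose analogy, not Protean code: the exception class is a hypothetical stand-in, `require_non_null` is an invented toy requirement, and the way arguments are passed to `or_else` is an assumption.

```python
class RequirementNotMetException(Exception):
    """Hypothetical stand-in for the exception named in this section."""

    def __init__(self, message, arguments):
        super().__init__(message)
        self.arguments = arguments  # extra entries exposed to or_else


def require_non_null(value, description):
    """Toy requirement: raise if the value is missing."""
    if value is None:
        raise RequirementNotMetException("Requirement not met", {"required": description})
    return value


def require_catch(value, or_else, arguments=None):
    """Evaluate `value`; if a requirement is not met, evaluate `or_else` instead."""
    try:
        return value()
    except RequirementNotMetException as error:
        # Entries carried by the exception are added to the arguments map.
        return or_else({**(arguments or {}), **error.arguments})


document = {"default": 123}
result = require_catch(
    value=lambda: {"result": require_non_null(document.get("non_existent"), "[R]$.non_existent")},
    or_else=lambda args: {
        "result": document["default"],
        "warning": f"Using default result due to unmet requirement: {args['required']}",
    },
)
```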
#
Parameters
#
Example
{
"spec": {
"#type": "require_catch",
"value": {
"result": "[R]$.non_existent"
},
"or_else": {
"result": "$.default",
"warning": "Using default result due to unmet requirement: {&.required}"
}
},
"input": {
"default": 123
},
"output": {
"result": 123,
"warning": "Using default result due to unmet requirement: [R]$.non_existent"
}
}
#
require
Resolves to the given value if a given required value is non-null. Otherwise, it resolves to null.
#
Parameters
#
Example
{
"spec": {
"required_is_non_null": {
"#type": "require",
"require": "$.existent",
"value": true
},
"required_is_null": {
"#type": "require",
"require": "$.non_existent",
"value": true
}
},
"input": {
"existent": "I exist"
},
"output": {
"required_is_non_null": true
}
}
#
require_string
Resolves to the given value if it is a string; otherwise, a RequirementNotMetException is thrown.
#
Parameters
#
Example
{
"spec": {
"results": {
"#type": "for_each",
"values": "$.values_from_input",
"spec": {
"#type": "require_catch",
"value": {
"#type": "require_string",
"value": "$"
},
"or_else": "&.message"
}
}
},
"input": {
"values_from_input": [
"This is a string",
3
]
},
"output": {
"results": [
"This is a string",
"Value is not string: 3"
]
}
}
#
require_throw
Resolves to the given value if it is non-null. Otherwise, a RequirementNotMetException is thrown.
#
Parameters
#
Example
{
"spec": {
"results": {
"#type": "for_each",
"values": "$.my_list",
"spec": {
"#type": "require_catch",
"value": {
"success": {
"a_static_value": "Some string",
"a_required_string": {
"#type": "require_throw",
"value": "$.type",
"args": {
"missing_field": "&.keys[-2]"
}
},
"a_required_map": {
"#type": "require_throw",
"value": {
"key_1": "$.may_exist_1",
"key_2": "$.may_exist_2"
},
"message": "All JSONPaths in 'a_required_map' failed to resolve",
"args": {
"missing_field": "&.keys[-2]"
}
}
}
},
"or_else": {
"error": {
"message": {
"#type": "fallback",
"strategies": [
"&.message",
"%.error_message"
]
},
"details": "Missing field is '{&.missing_field}'"
}
}
}
}
},
"input": {
"my_list": [
{
"type": "item",
"may_exist_1": "I think"
},
{
"type": "item",
"may_exist_2": "therefore I am"
},
{
"may_exist_1": "Won't render because 'type' is missing"
},
{
"type": "Won't render because both 'may_exist_1' and 'may_exist_2' are missing"
}
]
},
"config": {
"properties": {
"error_message": "Failed to apply transform"
}
},
"output": {
"results": [
{
"success": {
"a_static_value": "Some string",
"a_required_string": "item",
"a_required_map": {
"key_1": "I think"
}
}
},
{
"success": {
"a_static_value": "Some string",
"a_required_string": "item",
"a_required_map": {
"key_2": "therefore I am"
}
}
},
{
"error": {
"message": "Failed to apply transform",
"details": "Missing field is 'a_required_string'"
}
},
{
"error": {
"message": "All JSONPaths in 'a_required_map' failed to resolve",
"details": "Missing field is 'a_required_map'"
}
}
]
}
}
#
root
Resolves to the output of a given component where the root of the input stack for the component is set to a given value.
#
Parameters
#
Example
{
"spec": {
"result": {
"value_default_scope": "$$.value",
"value_root_scope": {
"#type": "root",
"root": "$.object",
"spec": "$$.value"
}
}
},
"input": {
"value": "default",
"object": {
"value": "root"
}
},
"output": {
"result": {
"value_default_scope": "default",
"value_root_scope": "root"
}
}
}
#
scope
Resolves to the output of a given component where a given value is appended to the input stack.
Effectively, this sets the scope for value to the output of scope, so paths will target the scoped entity instead of the root of the input JSON document.
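One way to picture the input stack is as a list whose last element is the current scope. The toy Python resolver below handles only dotted key paths, not JSONPath. It assumes `$` reads from the top of the stack and `$$` from the bottom, which is consistent with the examples in this reference but is not a statement about Protean's internals. All names are hypothetical.

```python
def resolve(path, stack):
    """Resolve a dotted path: '$$' starts at the bottom of the stack, '$' at the top."""
    if path.startswith("$$"):
        node, rest = stack[0], path[2:]
    else:
        node, rest = stack[-1], path[1:]
    for key in filter(None, rest.split(".")):
        node = node.get(key) if isinstance(node, dict) else None
    return node


def scope(stack, scope_path, value):
    """Evaluate `value` with the resolved scope appended to the input stack."""
    return value(stack + [resolve(scope_path, stack)])


document = {"nested_objects": {"level": 1, "child": {"level": 2, "child": {"level": 3}}}}
stack = [document]

full_path = resolve("$.nested_objects.child.child.level", stack)
scoped = scope(stack, "$.nested_objects", lambda s1:
         scope(s1, "$.child", lambda s2:
         scope(s2, "$.child", lambda s3: resolve("$.level", s3))))
```

Both `full_path` and `scoped` address the same innermost `level` entry.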
#
Parameters
#
Example
{
"spec": {
"result": {
"innermost_level_using_full_path": "$.nested_objects.child.child.level",
"innermost_level_using_scopes": {
"#type": "scope",
"scope": "$.nested_objects",
"value": {
"#type": "scope",
"scope": "$.child",
"value": {
"#type": "scope",
"scope": "$.child",
"value": "$.level"
}
}
}
}
},
"input": {
"nested_objects": {
"level": 1,
"child": {
"level": 2,
"child": {
"level": 3
}
}
}
},
"output": {
"result": {
"innermost_level_using_full_path": 3,
"innermost_level_using_scopes": 3
}
}
}
#
select
Resolves to a map that is the result of including or excluding the given keys from the given input map.
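In Python terms, the operation is close to a dictionary comprehension. The sketch below is illustrative: `"exclude"` appears in the example for this component, while `"include"` as the name of the default policy is an assumption.

```python
def select(value, keys, policy="include"):
    """Keep (or drop, when policy is 'exclude') the given keys of a map."""
    wanted = {keys} if isinstance(keys, str) else set(keys)  # a single key is allowed
    if policy == "exclude":
        return {key: item for key, item in value.items() if key not in wanted}
    return {key: item for key, item in value.items() if key in wanted}


source = {"key_1": "Value 1", "key_2": "Value 2", "key_3": "Value 3"}
included = select(source, ["key_1", "key_2"])
excluded = select(source, "key_2", policy="exclude")
```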
#
Parameters
#
Policy
#
Example
{
"spec": {
"result": [
{
"#type": "select",
"value": "$.map",
"keys": [
"key_1",
"key_2"
]
},
{
"#type": "select",
"value": "$.map",
"policy": "exclude",
"keys": "key_2"
}
]
},
"input": {
"map": {
"key_1": "Value 1",
"key_2": "Value 2",
"key_3": "Value 3"
}
},
"output": {
"result": [
{
"key_1": "Value 1",
"key_2": "Value 2"
},
{
"key_1": "Value 1",
"key_3": "Value 3"
}
]
}
}
#
sort
Resolves to a given list of values sorted by the specified sort key in the specified direction (ascending or descending), where unsorted items are handled using the given behaviour.
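The interaction between the sort direction and the handling of unsorted items can be sketched in Python as follows. Only `"desc"` and `"first"` are taken from the example for this component. The names `"asc"` and `"last"`, the defaults, and the use of a callable for the sort key are assumptions.

```python
def sort_values(values, by, direction="asc", unsorted="last"):
    """Sort by `by(item)`; items whose key is None are placed first or last."""
    sortable = [item for item in values if by(item) is not None]
    leftover = [item for item in values if by(item) is None]
    ordered = sorted(sortable, key=by, reverse=(direction == "desc"))
    return leftover + ordered if unsorted == "first" else ordered + leftover


items = [
    {"value": "value_1", "sort_key": 2},
    {"value": "value_2", "sort_key": 1},
    {"value": "value_3", "sort_key": 3},
    {"value": "missing_type"},
]
result = sort_values(items, by=lambda item: item.get("sort_key"), direction="desc", unsorted="first")
```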
#
Parameters
#
SortDirection
#
UnsortedBehaviour
#
Example
{
"spec": {
"result": {
"#type": "sort",
"values": "$.list",
"by": "$.sort_key",
"direction": "desc",
"unsorted": "first"
}
},
"input": {
"list": [
{
"value": "value_1",
"sort_key": 2
},
{
"value": "value_2",
"sort_key": 1
},
{
"value": "value_3",
"sort_key": 3
},
{
"value": "missing_type"
}
]
},
"output": {
"result": [
{
"value": "missing_type"
},
{
"value": "value_3",
"sort_key": 3
},
{
"value": "value_1",
"sort_key": 2
},
{
"value": "value_2",
"sort_key": 1
}
]
}
}
#
string_edit_distance
Resolves to the Levenshtein distance (a.k.a. edit distance) between two strings.
Optionally, a maximum distance threshold may be specified; if the distance is found to exceed the threshold, the calculation is aborted early and this component resolves to -1.
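For readers unfamiliar with the algorithm, the following Python sketch computes the Levenshtein distance row by row and stops early once every entry in the current row exceeds the threshold. It is a textbook formulation, not Protean's code, and the function signature is hypothetical.

```python
def string_edit_distance(source, target, threshold=None):
    """Levenshtein distance, or -1 if it exceeds the optional threshold."""
    previous = list(range(len(target) + 1))  # distances from the empty prefix
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        # The final distance can never be smaller than the minimum of any row.
        if threshold is not None and min(current) > threshold:
            return -1
        previous = current
    distance = previous[-1]
    if threshold is not None and distance > threshold:
        return -1
    return distance


distances = [
    string_edit_distance("John Smith", other, threshold=6)
    for other in ["Jane Smith", "Johnny Mit", "Cheese"]
]
# distances == [3, 5, -1]
```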
#
Parameters
#
Example
{
"spec": {
"results": {
"#type": "for_each",
"values": "$.to",
"spec": {
"from": "$$.from",
"to": "$",
"distance": {
"#type": "string_edit_distance",
"from": "$$.from",
"to": "$",
"threshold": 6
}
}
}
},
"input": {
"from": "John Smith",
"to": [
"Jane Smith",
"Johnny Mit",
"Cheese"
]
},
"output": {
"results": [
{
"from": "John Smith",
"to": "Jane Smith",
"distance": 3
},
{
"from": "John Smith",
"to": "Johnny Mit",
"distance": 5
},
{
"from": "John Smith",
"to": "Cheese",
"distance": -1
}
]
}
}
#
string_join
Resolves to a single string that is the concatenation of a given list of strings, optionally joined by a delimiter.
If any element of values is null, then the resulting behaviour depends on the value of on_null.
#
Parameters
#
OnNullResponse
#
Example
{
"spec": {
"result": {
"#type": "string_join",
"values": [
"The quick",
"$.string_from_input",
null,
"jumps over {$.another_phrase}"
],
"delimiter": " ",
"on_null": "ignore"
}
},
"input": {
"string_from_input": "brown fox",
"another_phrase": "the lazy dog"
},
"output": {
"result": "The quick brown fox jumps over the lazy dog"
}
}
#
string_length
Resolves to the length of the given string as an integer.
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "string_length",
"value": "$.value_from_input"
}
},
"input": {
"value_from_input": "abcdefg"
},
"output": {
"result": 7
}
}
#
string_split
Resolves to an array of strings formed by splitting the given input string on the given regex pattern.
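Because the delimiter is treated as a regex pattern, characters such as `.` or `|` need escaping, and a pattern can absorb surrounding whitespace. A small Python sketch of the idea follows. The function name is hypothetical, and edge cases such as trailing empty strings may differ from Protean's behaviour.

```python
import re


def string_split(value, delimiter):
    """Split `value` on every match of the regex `delimiter`."""
    return re.split(delimiter, value)


words = string_split("the,quick,brown,fox", ",")
trimmed = string_split("a , b,c", r"\s*,\s*")  # whitespace around commas is consumed
dotted = string_split("1.2.3", r"\.")          # a literal dot must be escaped
```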
#
Parameters
#
Example
{
"spec": {
"result": {
"#type": "string_split",
"value": "$.string_from_input",
"delimiter": ","
}
},
"input": {
"string_from_input": "the,quick,brown,fox,jumped,over,the,lazy,dog"
},
"output": {
"result": [
"the",
"quick",
"brown",
"fox",
"jumped",
"over",
"the",
"lazy",
"dog"
]
}
}
#
string_to_case
Resolves to the result of converting the given input string to the given casing style.
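A plausible way to implement such conversions is to split the input into words and then re-join them in the target style. The Python sketch below takes that approach. The word-splitting regex is an assumption (this reference does not say how Protean tokenises its input), although it does reproduce the outputs of the example for this component.

```python
import re

# A word is a run of capitals not followed by lowercase, an optionally
# capitalised lowercase run, or a run of digits.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def string_to_case(value, case):
    """Convert `value` to one of the casing styles named in the example."""
    if case == "lower":
        return value.lower()
    if case == "upper":
        return value.upper()
    words = [word.lower() for word in WORD.findall(value)]
    if not words:
        return ""
    style, _, shape = case.partition("_")  # e.g. "upper_snake" -> ("upper", "snake")
    if shape == "camel":
        head = words[0].capitalize() if style == "upper" else words[0]
        return head + "".join(word.capitalize() for word in words[1:])
    joined = ("_" if shape == "snake" else "-").join(words)
    return joined.upper() if style == "upper" else joined


snake = string_to_case("this is-A_stringWith Mixed-CASING", "lower_snake")
# snake == "this_is_a_string_with_mixed_casing"
```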
#
Parameters
#
Case
#
Example
{
"spec": {
"results": {
"#type": "for_each",
"values": [
"lower",
"upper",
"lower_camel",
"upper_camel",
"lower_snake",
"upper_snake",
"lower_kebab",
"upper_kebab"
],
"spec": {
"#type": "string_to_case",
"value": "$$.mixed_casing",
"case": "$"
}
}
},
"input": {
"mixed_casing": "this is-A_stringWith Mixed-CASING"
},
"output": {
"results": [
"this is-a_stringwith mixed-casing",
"THIS IS-A_STRINGWITH MIXED-CASING",
"thisIsAStringWithMixedCasing",
"ThisIsAStringWithMixedCasing",
"this_is_a_string_with_mixed_casing",
"THIS_IS_A_STRING_WITH_MIXED_CASING",
"this-is-a-string-with-mixed-casing",
"THIS-IS-A-STRING-WITH-MIXED-CASING"
]
}
}
#
string_to_json
Resolves to the deserialised JSON entity represented by the given serialised JSON string.
#
Parameters
#
Example
{
"spec": {
"my_json_entity": {
"#type": "string_to_json",
"value": "$.my_json_string",
"require_convert": true
}
},
"input": {
"my_json_string": "{\"some_key\":\"some_value\"}"
},
"output": {
"my_json_entity": {
"some_key": "some_value"
}
}
}
#
switch
Resolves to the output of the spec within a given map of cases whose key matches the output of a given test value, or the output of a given default spec if no cases match.
The behaviour of this component is reminiscent of a "switch" statement common to many programming languages.
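In a language without a switch statement, the same effect is often achieved with a map lookup and a default. The following Python sketch shows that analogy with plain values in place of specs. The function name is hypothetical.

```python
def switch(value, cases, default=None):
    """Return the case whose key equals `value`, or `default` if none matches."""
    return cases.get(value, default)


sounds = {"cow": "moo", "sheep": "bah", "dog": "woof"}
animal_sounds = [switch(animal, sounds, default="unknown") for animal in ["sheep", "dog", "cow", "fox"]]
# animal_sounds == ["bah", "woof", "moo", "unknown"]
```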
#
Parameters
#
Example
{
"spec": {
"animal_sounds": {
"#type": "for_each",
"values": "$.animals",
"spec": {
"#type": "switch",
"value": "$",
"cases": {
"cow": "moo",
"sheep": "bah",
"dog": "woof"
},
"default": "unknown"
}
}
},
"input": {
"animals": [
"sheep",
"dog",
"cow",
"fox"
]
},
"output": {
"animal_sounds": [
"bah",
"woof",
"moo",
"unknown"
]
}
}
#
to_uuid
Resolves to the name-based UUID string generated by converting the given value to bytes and passing them to the method UUID::nameUUIDFromBytes.
If the provided value is a string, the byte conversion is performed using the platform's default character set.
If the value is an integer or a long, it is converted to its representation as a big-endian byte sequence.
Otherwise, a RequirementNotMetException is thrown with the argument error_type populated with the value unsupported_type.
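Java's `UUID.nameUUIDFromBytes` produces a type 3 (MD5, name-based) UUID from a byte array. The Python sketch below mimics that construction to show the byte conversions described above. Several details are assumptions: UTF-8 stands in for the platform's default character set, the integer width is a parameter because Python does not distinguish int from long, and `TypeError` stands in for RequirementNotMetException.

```python
import hashlib
import uuid


def name_uuid_from_bytes(data):
    """Type 3 UUID built from the MD5 digest of `data`."""
    return uuid.UUID(bytes=hashlib.md5(data).digest(), version=3)


def to_uuid(value, int_width=8, encoding="utf-8"):
    """Name-based UUID string for a string or integer value."""
    if isinstance(value, str):
        data = value.encode(encoding)
    elif isinstance(value, int) and not isinstance(value, bool):
        # Big-endian two's complement, e.g. 4 bytes for an int, 8 for a long.
        data = value.to_bytes(int_width, "big", signed=True)
    else:
        raise TypeError("unsupported_type")
    return str(name_uuid_from_bytes(data))


name_uuid = to_uuid("John Smith")
```

The same input always produces the same UUID, which is the property that makes this component useful for deriving stable identifiers.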
#
Parameters
#
Example
{
"spec": {
"uuid": {
"#type": "to_uuid",
"value": "$.name"
}
},
"input": {
"name": "John Smith"
},
"output": {
"uuid": "6117323d-2cab-3c17-944c-2b44587f682c"
}
}
#
xml_to_json
Resolves to a JSON representation of the XML data provided in the given string.
The output of this component depends on which implementation of the SPI XmlToJsonConverter is provided in Config. By default, the implementation StandardXmlToJsonConverter is used, which converts XML into JSON in a standardized schema (see StandardXmlToJsonConverter below).
#
Parameters
#
Example
{
"spec": {
"#type": "xml_to_json",
"xml": "$.my_xml_data",
"require_convert": true
},
"input": {
"my_xml_data": "<root><element>a</element><element ordinal_text=\"second\">b</element><element>c<note>I am mixed</note></element><comment>Hello, world!</comment></root>"
},
"output": {
"root": [
{
"element": [
{
"@content": "a"
},
{
"@attributes": {
"ordinal_text": "second"
},
"@content": "b"
},
{
"@mixed": [
{
"@content": "c"
},
{
"note": [
{
"@content": "I am mixed"
}
]
}
]
}
],
"comment": [
{
"@content": "Hello, world!"
}
]
}
]
}
}
#
StandardXmlToJsonConverter
The XML-to-JSON converter StandardXmlToJsonConverter converts XML to JSON in a standardized schema, where the reserved key @attributes contains an XML element's attributes, @content contains the textual content of an element, and any other key contains the list of child elements whose tag name is given by the key.
Mixed content (i.e. mixtures of text and elements as direct children of the same element) can be handled in a variety of ways depending on the value of the parameter mixed_content_mode.
By default, each XML node that forms the mixed content is rendered independently using the above mapping and listed in sequence under the reserved key @mixed (see the example above).
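The default mapping can be sketched in Python with the standard library's ElementTree. This is an approximation of the schema described above, not the converter itself: the function names are hypothetical, stripping whitespace-only text is an assumption, and only the default mixed-content behaviour is modelled.

```python
import xml.etree.ElementTree as ET


def _text(raw):
    """Normalise optional text; whitespace-only text counts as empty."""
    return (raw or "").strip()


def is_mixed(element):
    """True if the element has child elements and text as direct children."""
    children = list(element)
    has_text = bool(_text(element.text)) or any(_text(child.tail) for child in children)
    return bool(children) and has_text


def convert(element):
    node = {}
    if element.attrib:
        node["@attributes"] = dict(element.attrib)
    children = list(element)
    if is_mixed(element):
        # Render each text run and child element independently, in document order.
        mixed = []
        if _text(element.text):
            mixed.append({"@content": _text(element.text)})
        for child in children:
            mixed.append({child.tag: [convert(child)]})
            if _text(child.tail):
                mixed.append({"@content": _text(child.tail)})
        node["@mixed"] = mixed
    elif children:
        # Group child elements into lists keyed by tag name.
        for child in children:
            node.setdefault(child.tag, []).append(convert(child))
    elif _text(element.text):
        node["@content"] = _text(element.text)
    return node


def xml_to_json(xml):
    root = ET.fromstring(xml)
    return {root.tag: [convert(root)]}


converted = xml_to_json('<root><element ordinal_text="second">b</element></root>')
# converted == {"root": [{"element": [{"@attributes": {"ordinal_text": "second"}, "@content": "b"}]}]}
```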
#
Custom parameters
Custom parameters for StandardXmlToJsonConverter are listed in the table below.
#
MixedContentMode
The accepted values for mixed_content_mode are detailed in the following table.
#
StandardXmlToJsonConverterFeature
The accepted keys for the features map are detailed in the following table.