# Getting Started
This project is subdivided into two main parts: the transform library, proteus-core, and the plugin for Rosetta, proteus-rosetta-plugin.
# Core Library
The library lets developers use Protean transforms in their own projects. To get started, include the following Maven dependency:
<dependency>
    <groupId>com.k-int.proteus</groupId>
    <artifactId>proteus-core</artifactId>
    <version>3.0.0</version>
</dependency>
The following code snippet demonstrates how to apply a Protean transform defined in spec.json to an input file input.json with default settings:
// 1) Load the spec file. ComponentSpec provides static methods for loading from files, resources or input streams
ComponentSpec<Object> spec = ComponentSpec.loadFile("spec.json");
// 2) Create a context for the transform
Context context = Context.builder().spec(spec).build();
// 3) Load the input json file (using a Jackson ObjectMapper, for example) and wrap it in an Input instance
Input input = new Input(loadJson("input.json"));
// 4) Retrieve the result of the transform. First, we receive an Optional from getComponent, then we unbox it using orElse
Object result = context
    .inputMapper(input)
    .getComponent()
    .orElse(null);
In the above example, we leave loading the input JSON up to the implementer, but for completeness the following is an example implementation of loadJson:
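import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.FileInputStream;
import java.io.IOException;

// Reads the named file and deserializes it into plain Java objects (maps, lists, strings, numbers)
static Object loadJson(String fileName) throws IOException
{
    return new ObjectMapper()
        .readValue(
            new FileInputStream(fileName),
            Object.class
        );
}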
Note that the constructor for Input accepts any object that can be serialized to JSON using Jackson, e.g. a java.util.Map or JsonNode.
Given the following spec.json and input.json, the above example would result in a map whose JSON representation is given by result.json:
spec.json
{
    "my_value": "$.input_key"
}
input.json
{
    "input_key": "input_value"
}
result.json
{
    "my_value": "input_value"
}
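To make the mapping from spec.json and input.json to result.json concrete, here is a deliberately simplified stand-in written in plain Java. It is not Proteus code and does not use the proteus-core API: the class PathSketch and its apply method are hypothetical, and the sketch assumes that only top-level paths of the form $.key need resolving, with every other value copied through unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not how Proteus is implemented.
final class PathSketch {

    // Resolves each spec value of the form "$.key" against the top level of the input map.
    // Any other value is copied through unchanged (an assumption made for this sketch).
    static Map<String, Object> apply(Map<String, Object> spec, Map<String, Object> input) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : spec.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof String && ((String) value).startsWith("$.")) {
                String key = ((String) value).substring(2);
                result.put(entry.getKey(), input.get(key));
            } else {
                result.put(entry.getKey(), value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> spec = Map.of("my_value", "$.input_key");
        Map<String, Object> input = Map.of("input_key", "input_value");
        System.out.println(apply(spec, input)); // prints {my_value=input_value}
    }
}
```

The sketch only mirrors the one-key example above; it says nothing about how Proteus handles nested paths, missing keys or other spec features.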
To alter the config, pass a Config object into the Context builder:
Context context = Context.builder()
    .spec(spec)
    .config(
        Config.builder()
            .objectMapper(
                new ObjectMapper()
                    .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
                    .setPropertyNamingStrategy(PropertyNamingStrategies.SNAKE_CASE)
            )
            .properties(
                Map.of(
                    "base_url",
                    "http://mysite.org/"
                )
            )
            .pathNotFoundToNull(true)
            .build()
    )
    .build();
# Rosetta Plugin
The Rosetta plugin provides factories for Proteus Glyphs and Providers.
The Glyph applies Protean transforms to input objects provided by the Rosetta data service. The input schema and expected output depend on the context in which the Glyph is applied. For example, if it is applied in the data phase of a Rosetta Profile, the transform acts upon individual records and is expected to produce the new transformed record in full.
The Provider applies a Protean transform to incoming Rosetta requests, where the output must conform to the schema of a Rosetta ProviderResponse object.
The plugin also provides a RESTful POST endpoint at {rosetta base url}/proteus/transform for testing Protean transforms against input supplied in the request.
# Installation
To install the plugin, you will first need to obtain a fat JAR for the proteus-rosetta-plugin submodule. You can build this yourself from source, e.g. using the following commands:
git clone --branch proteus-3.0.0 --depth 1 https://gitlab.com/knowledge-integration/ciim/proteus.git
cd proteus
mvn clean install
After this completes, the fat JAR will be located at ./proteus-rosetta-plugin/target/proteus-rosetta-plugin-3.0.0.jar.
Alternatively, you can download a pre-built version from the k-int Maven repository using:
mvn dependency:copy -Dartifact=com.k-int.proteus:proteus-rosetta-plugin:3.0.0 -DoutputDirectory=./download/location
Once you have the JAR, you can install it into a Rosetta deployment by moving it into the plugins directory and restarting Rosetta.
For dockerized deployments, you can inject the plugin into the container's plugin directory from the host machine in the docker-compose.yml by listing the plugin JAR as a volume, as in the following:
version: "3.8"
services:
  rosetta:
    image: registry.gitlab.com/knowledge-integration/ciim/rosetta:latest
    volumes:
      - ./plugins/proteus-rosetta-plugin-3.0.0.jar:/rosetta/plugins/proteus-rosetta-plugin-3.0.0.jar
    ports:
      - "4923:4923"
    networks:
      - internal
    restart: always
Alternatively, you can build a new image with the plugin built in using the following Dockerfile:
FROM registry.gitlab.com/knowledge-integration/ciim/rosetta:latest
ADD proteus-rosetta-plugin-3.0.0.jar /rosetta/plugins
USER root
RUN chown -R rosetta:rosetta /rosetta/plugins/proteus-rosetta-plugin-3.0.0.jar
USER rosetta
# Configuration
To create a Proteus Glyph or Provider, add an entry to the appropriate list within Rosetta's application.yml
using the corresponding entry type from table Entity types.
At a minimum, you will need to specify the entity name, type and path to the JSON file containing the Protean transform in the entry's property properties.spec.
Optional properties include: a properties map used to define external properties accessed using the % source specifier (see paths); a list of files to include that also populate the external properties; a Boolean flag that determines whether the entity is "mutable", i.e. whether the spec should be reloaded at runtime when it is changed; and other config fields from the class Config.
Examples of minimally- and fully-specified Glyph config follow. See Rosetta Entity Config Properties for complete config reference.
rosetta:
  transform:
    glyphs:
      - name: minimal_proteus_glyph
        type: proteus
        properties:
          spec: ./path/to/minimal_spec.json
      - name: full_proteus_glyph
        type: proteus
        input: full
        properties:
          spec: ./path/to/full_spec.json
          mutable: true
          config:
            properties:
              base_url: http://my_site.org/
              other_external_property: Some value
            path_not_found_to_null: false
            unbox_singletons: false
            reader_cache_max_size: 200
            type_registries:
              - $.my_custom_type_registry
              - $.my_other_type_registry
            http_client: java
            accumulator: last
            cache: proteus-cache
            encryption_provider: java_aes
            encryption_salt: abcdefg123456
            include:
              - ./path/to/external_properties.json
            log_missing_paths: true
# Transform Endpoint
The transform endpoint POST /proteus/transform allows you to execute a Protean transform using input and config supplied in the request.
The request body is composed of a JSON object with three child nodes: spec, input and config, corresponding to the transform specification, input JSON and config, respectively. For example:
Request body
{
    "spec": {
        "my_value": "$.input_key",
        "other_value": "%.property_key"
    },
    "input": {
        "input_key": "input_value"
    },
    "config": {
        "properties": {
            "property_key": "property_value"
        }
    }
}
Response body
{
    "my_value": "input_value",
    "other_value": "property_value"
}
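As a rough sketch, a request like the one above could be sent from Java with the JDK's built-in java.net.http client (Java 11 or later). The class and method names below are hypothetical, the base URL http://localhost:4923 is an assumption borrowed from the port mapping in the docker-compose example, and the Content-Type header is assumed rather than documented here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for calling the transform endpoint; not part of the plugin.
final class TransformEndpointSketch {

    // Assumption: Rosetta is reachable here. Substitute your deployment's base URL.
    static final String ROSETTA_BASE_URL = "http://localhost:4923";

    // Builds a POST request whose JSON body carries the spec, input and config nodes.
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/proteus/transform"))
            .header("Content-Type", "application/json") // assumed content type
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    // Sends the request and returns the raw response body as a string.
    static String postTransform(String baseUrl, String jsonBody)
            throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(baseUrl, jsonBody), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling TransformEndpointSketch.postTransform(TransformEndpointSketch.ROSETTA_BASE_URL, body), with body set to the request JSON shown above, should return the response JSON as a string, provided a Rosetta instance with the plugin installed is running at that address.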
The component http_request is disallowed by default. This can be changed by setting the configuration property rosetta.plugins.proteus.api.http_client to either ok_http or java.