# Getting Started

This project is subdivided into two main parts: the transform library, proteus-core, and the plugin for Rosetta, proteus-rosetta-plugin.

# Core Library

The library lets developers use Protean transforms in their own projects. To get started, add the following Maven dependency:

<dependency>
    <groupId>com.k-int.proteus</groupId>
    <artifactId>proteus-core</artifactId>
    <version>3.0.0</version>
</dependency>

The following code snippet demonstrates how to apply a Protean transform defined in spec.json to an input file input.json with default settings:

// 1) Load the spec file. ComponentSpec provides static methods for loading from files, resources or input streams
ComponentSpec<Object> spec = ComponentSpec.loadFile("spec.json");

// 2) Create a context for the transform
Context context = Context.builder().spec(spec).build();

// 3) Load the input json file (using a Jackson ObjectMapper, for example) and wrap it in an Input instance
Input input = new Input(loadJson("input.json"));

// 4) Retrieve the result of the transform. getComponent returns an Optional, which we unwrap using orElse
Object result = context
    .inputMapper(input)
    .getComponent()
    .orElse(null);

The example above leaves loading the input JSON to the implementer. For completeness, here is one possible implementation of loadJson:

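// Requires java.io.FileInputStream, java.io.IOException and com.fasterxml.jackson.databind.ObjectMapper
static Object loadJson(String fileName) throws IOException
{
    // Deserialize the file into plain Java objects (maps, lists, strings, numbers, ...)
    return new ObjectMapper()
        .readValue(
            new FileInputStream(fileName),
            Object.class
        );
}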

Note that the constructor for Input accepts any object that can be serialized to JSON using Jackson, e.g. a java.util.Map or a JsonNode.

Given the following spec.json and input.json, the above example produces a map whose JSON representation is given by result.json:

spec.json

{
  "my_value": "$.input_key"
}

input.json

{
  "input_key": "input_value"
}

result.json

{
  "my_value": "input_value"
}
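
To make the mapping concrete, the following plain-Java sketch mimics what this particular spec does: each spec value of the form $.key is replaced by the value stored under key in the input. It is only an illustration of the idea, not how proteus-core is implemented. The class SpecSketch and its apply method are hypothetical, and the sketch handles only flat specs with single-key paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: this is NOT the proteus-core implementation.
// It handles just flat specs whose values are single-key paths such as "$.input_key".
class SpecSketch {

    static Map<String, Object> apply(Map<String, String> spec, Map<String, Object> input) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : spec.entrySet()) {
            String path = entry.getValue();
            if (path.startsWith("$.")) {
                // Strip the "$." prefix and look the remaining key up in the input
                result.put(entry.getKey(), input.get(path.substring(2)));
            } else {
                // Anything else is copied through unchanged in this sketch
                result.put(entry.getKey(), path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> spec = Map.of("my_value", "$.input_key");
        Map<String, Object> input = Map.of("input_key", "input_value");
        System.out.println(apply(spec, input)); // prints {my_value=input_value}
    }
}
```

For real transforms, use the Context and Input classes shown earlier; they support far more than this sketch does.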

To alter the config, pass a Config object into the Context builder:

Context context = Context.builder()
        .spec(spec)
        .config(
            Config.builder()
                .objectMapper(
                    new ObjectMapper()
                        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
                        .setPropertyNamingStrategy(PropertyNamingStrategies.SNAKE_CASE)
                )
                .properties(
                    Map.of(
                        "base_url",
                        "http://mysite.org/"
                    )
                )
                .pathNotFoundToNull(true)
                .build()
        )
        .build();

# Rosetta Plugin

The Rosetta plugin provides factories for Proteus Glyphs and Providers.

The Glyph applies Protean transforms to input objects provided by the Rosetta data service. The input schema and expected output depend on the context in which the Glyph is applied. For example, if it is applied in the data phase of a Rosetta Profile, the transform acts upon individual records and is expected to produce the transformed record in full.

The Provider applies a Protean transform to incoming Rosetta requests, where the output must conform to the schema of a Rosetta ProviderResponse object.

The plugin also provides a RESTful POST endpoint at {rosetta base url}/proteus/transform for testing Protean transforms against input supplied in the request.

# Installation

To install the plugin, you first need to obtain a fat JAR for the proteus-rosetta-plugin submodule. You can build it yourself from source, e.g. using the following commands:

git clone --branch proteus-3.0.0 --depth 1 https://gitlab.com/knowledge-integration/ciim/proteus.git
cd proteus
mvn clean install

After this completes, the fat JAR will be located at ./proteus-rosetta-plugin/target/proteus-rosetta-plugin-3.0.0.jar.

Alternatively, you can download a pre-built version from the k-int Maven repository using:

mvn dependency:copy -Dartifact=com.k-int.proteus:proteus-rosetta-plugin:3.0.0 -DoutputDirectory=./download/location

Once you have the JAR, you can install it into a Rosetta deployment by moving it into the plugins directory and restarting Rosetta.

For dockerized deployments, you can mount the plugin JAR from the host machine into the container's plugin directory by listing it as a volume in docker-compose.yml:

version: "3.8"
services:
  rosetta:
    image: registry.gitlab.com/knowledge-integration/ciim/rosetta:latest
    volumes:
      - ./plugins/proteus-rosetta-plugin-3.0.0.jar:/rosetta/plugins/proteus-rosetta-plugin-3.0.0.jar
    ports:
      - "4923:4923"
    networks:
      - internal
    restart: always

# Declares the network referenced by the service above
networks:
  internal:

Alternatively, you can build a new image with the plugin built in using the following Dockerfile:

FROM registry.gitlab.com/knowledge-integration/ciim/rosetta:latest
# COPY is preferred over ADD for local files; --chown sets ownership without switching to root
COPY --chown=rosetta:rosetta proteus-rosetta-plugin-3.0.0.jar /rosetta/plugins/
USER rosetta

# Configuration

To create a Proteus Glyph or Provider, add an entry to the appropriate list within Rosetta's application.yml using the corresponding entry type from the Entity types table.

At a minimum, you must specify the entity's name and type, and set properties.spec to the path of the JSON file containing the Protean transform.

Optional properties include: a properties map defining external properties, which are accessed using the % source specifier (see paths); a list of files to include, which also populate the external properties; a Boolean flag that determines whether the entity is "mutable", i.e. whether the spec is reloaded at runtime when it changes; and other config fields from the class Config.

Examples of a minimally specified and a fully specified Glyph config follow. See Rosetta Entity Config Properties for the complete config reference.

rosetta:
  transform:
    glyphs:
      - name: minimal_proteus_glyph
        type: proteus
        properties:
          spec: ./path/to/minimal_spec.json

      - name: full_proteus_glyph
        type: proteus
        input: full
        properties:
          spec: ./path/to/full_spec.json
          mutable: true
          config:
            properties:
              base_url: http://my_site.org/
              other_external_property: Some value
            path_not_found_to_null: false
            unbox_singletons: false
            reader_cache_max_size: 200
            type_registries:
              - $.my_custom_type_registry
              - $.my_other_type_registry
            http_client: java
            accumulator: last
            cache: proteus-cache
            encryption_provider: java_aes
            encryption_salt: abcdefg123456
          include:
            - ./path/to/external_properties.json
          log_missing_paths: true

# Transform Endpoint

The transform endpoint POST /proteus/transform allows you to execute a Protean transform using input and config supplied in the request. The request body is a JSON object with three child nodes, spec, input and config, which hold the transform specification, the input JSON and the config, respectively. For example:

Request body

{
  "spec": {
    "my_value": "$.input_key",
    "other_value": "%.property_key"
  },
  "input": {
    "input_key": "input_value"
  },
  "config": {
    "properties": {
      "property_key": "property_value"
    }
  }
}

Response body

{
  "my_value": "input_value",
  "other_value": "property_value"
}
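
A request like the one above could be sent from Java with the JDK's built-in java.net.http client (Java 11 or later). This is a minimal sketch under several assumptions: the base URL http://localhost:4923 is taken from the port mapping in the docker-compose example, the Content-Type header is a guess at what the endpoint expects, and the class and method names are hypothetical. Authentication, if your deployment requires it, is not shown.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for calling the transform endpoint; not part of the plugin.
class TransformEndpointSketch {

    // Builds the POST request; baseUrl is an assumed value for your Rosetta deployment
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/proteus/transform"))
            .header("Content-Type", "application/json") // assumed; not stated in this guide
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Same spec, input and config as in the request body shown above
        String body = "{"
            + "\"spec\": {\"my_value\": \"$.input_key\", \"other_value\": \"%.property_key\"},"
            + "\"input\": {\"input_key\": \"input_value\"},"
            + "\"config\": {\"properties\": {\"property_key\": \"property_value\"}}"
            + "}";

        HttpRequest request = buildRequest("http://localhost:4923", body);

        try {
            // Send the request and print the status code and the response JSON
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means no Rosetta instance is listening at the assumed base URL
            System.err.println("Could not reach Rosetta: " + e);
        }
    }
}
```

Keeping the request construction in its own method makes it easy to check the URL and headers without a running Rosetta instance.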