Working with Data

The Synchrony Project is not just for building and solving factor graphs. You can also store arbitrary data and link it to nodes in the graph. The graph then becomes a natural index for all sensory (and processed) data, so you can randomly access all sensory information across multiple sessions and multiple robots. Timestamped poses are naturally great indices for searching data in the time domain, in locations or regions, or across multiple devices. Think shared, collective, successively growing memory :)

We're still working on the best ways to do this, but it's one of our key missions: to provide you with a simple way to insert massive amounts of sensory data into the graph, and to efficiently query and extract it at some point in the future across multiple systems.

If you want to see the start of this at work, take a look at the Brookstone Rover example, where we:

Important Notes

A few important notes before we continue:

An Overview of Our Data Model

Consider that a single pose can have multiple raw data elements attached to it - a camera image, a lidar scan, an audio snippet. It can also have processed data elements, which may be added once that processing is completed.

In our data model, we let you attach named data elements to any node. That means that data looks like a big dictionary, with some additional property information, like this:
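As a rough mental model only, a node's data can be pictured as a dictionary keyed by element name. The sketch below is plain Julia; `SketchElement` and its fields are hypothetical stand-ins chosen for illustration and are not SynchronySDK types:

```julia
# Hypothetical stand-in for illustration; not a SynchronySDK type.
struct SketchElement
    description::String
    mimeType::String
    data::String
end

# One pose's data: element name => element
poseData = Dict{String, SketchElement}(
    "CameraImage" => SketchElement("Front camera frame", "image/jpeg", "<base64 bytes>"),
    "LidarScan"   => SketchElement("Raw lidar sweep", "application/octet-stream", "<base64 bytes>"),
)

# Named lookup, as with any dictionary
println(poseData["CameraImage"].mimeType)
# List the names attached to this pose
println(sort(collect(keys(poseData))))
```

The point of the sketch is only that elements are addressed by name per node; the real requests and responses carry the properties described in the sections that follow.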

In the following sections, we look at:

To start, let's assume we have a valid Synchrony configuration and have a node from a graph:

using Base
using SynchronySDK

# 1. Get a Synchrony configuration
# Assume that you're running in local directory
synchronyConfig = loadConfigFile("synchronyConfig.json")

robotId = "Hexagonal" # Update these
sessionId = "HexDemo1" # Update these

# Get all nodes and select the first for this example
sessionNodes = getNodes(synchronyConfig, robotId, sessionId);
if length(sessionNodes.nodes) == 0
  error("Please update the robotId and sessionId to give back some existing nodes, or run the hexagonal example to make a new dataset.")
end

# Get the first node - we don't need the complete node, just the summary, so no getNode call needed.
node = sessionNodes.nodes[1]

Listing All Data Entries in a Pose or Factor

We can extract all data entries with the getDataEntries method:

dataEntries = getDataEntries(synchronyConfig, robotId, sessionId, node)
@show dataEntries

dataEntry = nothing
if length(dataEntries) > 0
  dataEntry = dataEntries[1]
else
  warn("No data entries returned, you may want to add data before doing the get element call below...")
end

In the normal hexagonal example we added an image just for this purpose. You should see a single element listed if you're using that session.

Each data entry response contains the following information:

mutable struct BigDataEntryResponse
    id::String
    nodeId::Int
    sourceName::String
    description::String
    mimeType::String
    lastSavedTimestamp::String
    links::Dict{String, String}
end

Don't worry too much about sourceName for now (it really only features in our next release), but the other parameters are important:

Getting and Viewing Data Elements

Entries are distinct from elements, because we want you to be able to list data quickly (get summary entries), and then choose which data elements you want to retrieve. Each element could be ~100 MB, so our APIs are designed to let you cherry-pick what to pull down the wire.
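One way to picture the list-then-fetch pattern is to filter the lightweight summaries first and only then request the heavy payloads. The following is a minimal sketch in plain Julia: `EntrySummary` is a local stand-in that mirrors a few fields of the entry response shown earlier, and `selectByMime` is a hypothetical helper, not part of the SDK:

```julia
# Stand-in mirroring some fields of the entry response (illustration only).
struct EntrySummary
    id::String
    description::String
    mimeType::String
end

# Hypothetical helper: keep only entries whose MIME type starts with a prefix.
selectByMime(entries, prefix) = filter(e -> startswith(e.mimeType, prefix), entries)

entries = [
    EntrySummary("TestImage", "Camera frame", "image/jpeg"),
    EntrySummary("Matrix_Entry", "An example matrix", "application/json"),
    EntrySummary("Thumb", "Thumbnail", "image/png"),
]

# Only the image summaries survive; nothing large has been downloaded yet.
images = selectByMime(entries, "image/")
println([e.id for e in images])
```

In real use, only the selected entries would then be passed on to the element-retrieval call, so the large payloads you don't need never travel over the wire.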

We can now get the element:

@show dataElem = getDataElement(synchronyConfig, robotId, sessionId, node, dataEntry)

This contains the same information as the entry, but there is now a data string property with the data in it. Generally we base64 encode this data to make sure it fits into a string datatype, and if you've retrieved the image from the Hexagonal example, it should just look like a bunch of ASCII.

If we want to skip getting all the entry information again, we can just call getRawDataElement, which returns a string:

@show dataElemRaw = getRawDataElement(synchronyConfig, robotId, sessionId, node, dataEntry)

In the Hexagonal example, we base64 encoded an image and attached it to every pose. Note that at the moment, if a big data element has an image MIME type, it's automatically base64 decoded whenever getRawDataElement is called. If you're using that data set, we can visualize this image with the following snippet:

imgBytes = dataElemRaw

# Use the neat Images.jl, ImageView.jl, and ImageMagick.jl to show it
# In case you haven't added them:
Pkg.add("Images")
Pkg.add("ImageView")
Pkg.add("ImageMagick")
using Images, ImageView, ImageMagick

# Now read the binary as an image and show
image = readblob(imgBytes)
imshow(image)

If you're wondering about the base64decode step, please take a look at the last section in this document.

Attaching, Updating, and Deleting Data Elements

Now that we've discussed getting data, adding, updating, and deleting data elements is straightforward.

Adding or Updating Data

To add data, just make a BigDataElementRequest (or use a helper to make one), and submit it.

Structures and JSON

We can encode structures as JSON, and send those (it's JSON - no base64 encoding required and they display nicely in the UI):

mutable struct TestStruct
  ints::Vector{Int}
  testString::String
  doubleNum::Float64
end

using JSON # provides JSON.json and JSON.parse

testStruct = TestStruct(1:10, "A test struct", 3.14159)
enc = JSON.json(testStruct)
request = BigDataElementRequest("Struct_Entry", "", "An example struct", enc)
@show structElement = addOrUpdateDataElement(synchronyConfig, robotId, sessionId, node, request)

Now we can retrieve it to see it again:

@show dataElemRaw = getRawDataElement(synchronyConfig, robotId, sessionId, node, "Struct_Entry")

Actually, we've ended up doing this so much we've made a simple helper method for it as well:

using SynchronySDK.DataHelpers
request = encodeJsonData("Struct_Entry", "An example struct", testStruct)
structElement = addOrUpdateDataElement(synchronyConfig, robotId, sessionId, node, request)

A Matrix

Similarly, we can construct a huge(ish) 2D matrix, encode it using JSON or ProtoBufs or JLD etc., and submit it. As above, let's encode it as JSON:

myMat = rand(100, 100);
dataBytes = JSON.json(myMat);
# Make a Data request
request = BigDataElementRequest("Matrix_Entry", "", "An example matrix", dataBytes, "application/json");
# Attach it to the node
addOrUpdateDataElement(synchronyConfig, robotId, sessionId, node, request);

Now we can retrieve it to see it again:

dataElemRaw = getRawDataElement(synchronyConfig, robotId, sessionId, node, "Matrix_Entry");
myMatDeser = JSON.parse(dataElemRaw);
# JSON matrices are deserialized as a vector of vectors, so put it back into a 2D matrix.
# TODO - There's probably an easier way to do this.
els = myMatDeser[1];
for i in 2:length(myMatDeser)
  append!(els, myMatDeser[i])
end
myMatDeser = reshape(els, (100,100));
myMatDeser
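A shorter alternative to the append-and-reshape loop is to concatenate the nested vectors directly. This is a sketch under two assumptions: that the parsed value is a vector of columns (which is what the append-and-reshape loop above implies), and that you are on a Julia version where `reduce(hcat, ...)` is available. It simulates the parsed structure by hand so that it runs without any packages:

```julia
# Simulate what a JSON parser typically hands back for a 2x3 matrix:
# a vector of columns, each an untyped vector.
original = [1.0 3.0 5.0; 2.0 4.0 6.0]
nested = Any[Any[original[i, j] for i in 1:size(original, 1)] for j in 1:size(original, 2)]

# Concatenate the columns side by side, then restore a concrete element type.
rebuilt = Float64.(reduce(hcat, nested))

println(size(rebuilt))
println(rebuilt == original)
```

Unlike the reshape approach, this does not need the matrix dimensions to be hard-coded, since they follow from the nesting itself.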

If we want to save it in a more compact format, we can use JLD (a Julia-specific format built on HDF5):

#TODO...

Either way, if you want to save it quickly using Base64 binary encoding, there's a simple helper method for this in SynchronySDK.DataHelpers:

using SynchronySDK.DataHelpers

request = encodeBinaryData("Matrix_Entry", "An example matrix", dataBytes)
# Attach it to the node
@show matrixElement = addOrUpdateDataElement(synchronyConfig, robotId, sessionId, node, request)

Images

Images can be sent as their raw encoded bytes with an image MIME type, which makes them displayable in your browser. We've made a helper to load files, which works well here:

request = DataHelpers.readFileIntoDataRequest(joinpath(Pkg.dir("SynchronySDK"), "examples", "pexels-photo-1004665.jpeg"), "TestImage", "Pretty neat public domain image", "image/jpeg");
imgElement = addOrUpdateDataElement(synchronyConfig, robotId, sessionId, node, request)

As above, blobs with an image/* MIME type are automatically base64 decoded before they are returned by the getRawDataElement method. Let's retrieve the image and use the Julia image libraries to show it:

using Images, ImageView, ImageMagick

# Read it, decode it, and make an image all in one line
# NOTE: Normally we have to do base64 decoding, but images are automatically decoded so that they can be shown in browser.
image = readblob(getRawDataElement(synchronyConfig, robotId, sessionId, node, "TestImage"));
# Show it
imshow(image)

Deleting Data

Deleting data is done by calling deleteDataElement. For example, we can delete the matrix and the struct elements we just added:


# Delete by element reference
@show deleteDataElement(synchronyConfig, robotId, sessionId, node, matrixElement)
# Delete by string key
@show deleteDataElement(synchronyConfig, robotId, sessionId, node, "Struct_Entry")

Discussion on Base64 Encoding and Decoding

A quick but important point on encoding: it's not strictly required, but if your data may contain special characters, we recommend that you base64 encode it and then decode it when you retrieve it. That way, non-ASCII characters won't be an issue. A little more data has to travel up and down the wire, but it's more robust overall.

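# On Julia 0.7 and later, run `using Base64` first
@show unsafeString = "This is an unsafe string...\n\0"
@show encBytes = base64encode(unsafeString)
# Decode the encoded string we just produced
@show decBytes = base64decode(encBytes)
@show unsafeReturned = String(decBytes)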
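The same round trip works for arbitrary binary data, not just strings. The sketch below assumes Julia 0.7 or later, where `Base64` is a standard library that has to be loaded explicitly; it uses no SDK calls and the byte values are arbitrary:

```julia
using Base64  # standard library on Julia 0.7 and later

# Arbitrary bytes, including a NUL and bytes outside the ASCII range
raw = UInt8[0x00, 0xff, 0x10, 0x41, 0x0a, 0x80]

encoded = base64encode(raw)      # ASCII-only String
decoded = base64decode(encoded)  # Vector{UInt8}

println(encoded)
println(decoded == raw)
println(length(encoded), " characters for ", length(raw), " bytes")
```

Base64 emits four characters for every three input bytes, so the "little more data" mentioned above is an overhead of roughly one third.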