Working with Documents

Overview

Today’s applications are required to be highly responsive and always online. To achieve low latency and high availability, instances of these applications need to be deployed in datacenters that are close to their users. These applications are typically deployed in multiple datacenters and are called globally distributed.

Globally distributed applications need a geo-distributed fast data platform that transparently replicates data anywhere in the world, so that each application instance operates on a copy of the data close to its users. Similarly, the applications need geo-replicated and local streams to handle pub-sub, ETL, and real-time updates from the fast data platform.

Macrometa GDN is a fully managed, geo-distributed, low-latency data service with turnkey global distribution and transparent multi-master replication. You can run globally distributed, low-latency workloads within GDN.

This article is an introduction to working with documents in GDN using the pyC8 (Python) and jsC8 (JavaScript) drivers.

In the drivers, a document is a dictionary/object that is JSON serializable with the following properties:

  • Contains the _key field, which identifies the document uniquely within a specific collection.
  • Contains the _id field (also called the handle), which identifies the document uniquely across all collections within a fabric. This ID is a combination of the collection name and the document key using the format {collection}/{key} (see example below).
  • Contains the _rev field. GDN supports MVCC (Multi-Version Concurrency Control) and can store each document in multiple revisions. This field indicates the latest revision of a document. It is populated by GDN and is not required as input unless you want to validate a document against its current revision.
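
The revision check mentioned in the last bullet can be pictured with a small in-memory model. This is only a sketch: TinyStore, its methods, and the "rev-N" revision strings are hypothetical and are not part of pyC8 or GDN. It shows how supplying _rev lets a writer detect that a document changed after it was read.

```python
class RevisionMismatch(Exception):
    """Raised when the caller's _rev differs from the stored revision."""


class TinyStore:
    """Hypothetical in-memory stand-in for a collection (not the GDN API)."""

    def __init__(self):
        self._docs = {}      # _key -> stored document
        self._counter = 0    # used to mint new revision strings

    def _next_rev(self):
        self._counter += 1
        return "rev-{}".format(self._counter)

    def insert(self, doc):
        stored = dict(doc, _rev=self._next_rev())
        self._docs[stored["_key"]] = stored
        return dict(stored)

    def update(self, doc):
        current = self._docs[doc["_key"]]
        # Validate only when the caller supplied a revision.
        if "_rev" in doc and doc["_rev"] != current["_rev"]:
            raise RevisionMismatch(doc["_key"])
        merged = dict(current)
        merged.update(doc)
        merged["_rev"] = self._next_rev()
        self._docs[doc["_key"]] = merged
        return dict(merged)


store = TinyStore()
first = store.insert({"_key": "bruce", "is_rich": True})
store.update({"_key": "bruce", "is_rich": False})            # no _rev: no check
try:
    store.update({"_key": "bruce", "_rev": first["_rev"]})   # stale revision
    outcome = "updated"
except RevisionMismatch:
    outcome = "conflict"
print(outcome)
```

The second update carries the revision returned by the insert, which is no longer current, so the toy store rejects it.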

Here is an example of a valid document:

    {
        '_id': 'students/bruce',
        '_key': 'bruce',
        '_rev': '_Wm3dzEi--_',
        'first_name': 'Bruce',
        'last_name': 'Wayne',
        'address': {
            'street' : '1007 Mountain Dr.',
            'city': 'Gotham',
            'state': 'NJ'
        },
        'is_rich': True,
        'friends': ['robin', 'gordon']
    }
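
As a minimal sketch of the properties listed above, the following builds the handle from a collection name and a key and checks that a document is JSON serializable. The helper names make_handle and is_json_serializable are illustrative only; they are not driver functions, and GDN normally fills in _id itself.

```python
import json


def make_handle(collection, key):
    """Build the _id handle in the {collection}/{key} format."""
    return "{}/{}".format(collection, key)


def is_json_serializable(doc):
    """Return True if the document can be encoded as JSON."""
    try:
        json.dumps(doc)
        return True
    except (TypeError, ValueError):
        return False


student = {
    "_key": "bruce",
    "first_name": "Bruce",
    "last_name": "Wayne",
    "is_rich": True,
    "friends": ["robin", "gordon"],
}
student["_id"] = make_handle("students", student["_key"])

print(student["_id"])                           # students/bruce
print(is_json_serializable(student))            # True
print(is_json_serializable({"tags": {1, 2}}))   # False: a set is not a JSON type
```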

Edge documents (edges) are similar to standard documents but with two additional required fields _from and _to. Values of these fields must be the handles of "from" and "to" vertex documents linked by the edge document in question. Here is an example of a valid edge document:

    {
        '_id': 'friends/001',
        '_key': '001',
        '_rev': '_Wm3dyle--_',
        '_from': 'students/john',
        '_to': 'students/jane',
        'closeness': 9.5
    }
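
The edge requirements can be checked on the client side before inserting. The sketch below assumes nothing about the drivers: split_handle and check_edge are hypothetical helpers that only verify the {collection}/{key} format of _from and _to, not that the vertex documents actually exist.

```python
def split_handle(handle):
    """Split 'collection/key' into its two parts, or raise ValueError."""
    collection, sep, key = handle.partition("/")
    if not sep or not collection or not key or "/" in key:
        raise ValueError("not a document handle: {!r}".format(handle))
    return collection, key


def check_edge(edge):
    """Return the parsed (_from, _to) handles of a well-formed edge."""
    missing = [field for field in ("_from", "_to") if field not in edge]
    if missing:
        raise ValueError("edge is missing {}".format(", ".join(missing)))
    return split_handle(edge["_from"]), split_handle(edge["_to"])


edge = {
    "_key": "001",
    "_from": "students/john",
    "_to": "students/jane",
    "closeness": 9.5,
}
print(check_edge(edge))   # (('students', 'john'), ('students', 'jane'))
```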

Note

If you are new to Macrometa GDN, we strongly recommend reading Essentials of Macrometa GDN.

Prerequisites

The examples in this article assume the following credentials:

  • Tenant email is guest@macrometa.io.
  • User password is guest.

Driver download

pyC8 requires Python 3.5+. Python 3.6 or higher is recommended.

To install pyC8, run:

    pip3 install pyC8

or, if you prefer to use conda:

    conda install -c conda-forge pyC8

or pipenv:

    pipenv install --pre pyC8

Once the installation process is finished, you can begin developing applications in Python.

To install jsC8 with Yarn or npm:

    yarn add jsc8
    # or
    npm install jsc8

If you want to use the driver outside of the current directory, you can also install it globally using the `--global` flag:

    npm install --global jsc8

To build jsC8 from source:

    git clone https://github.com/macrometacorp/jsc8.git
    cd jsc8
    npm install
    npm run dist

Connect to GDN

The first step in using GDN is to establish a connection to a region. The code below initializes a connection to the region closest to your location.

from c8 import C8Client

print("Connect to C8...")
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io");

Get GeoFabric Details

To get the details of a fabric:

from c8 import C8Client
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

demotenant = client.tenant(email='guest@macrometa.io', password='guest')
print("Get geo fabric...")
fabric = demotenant.useFabric('_system')
print("Get geo fabric details...")
print(fabric.fabrics_detail())

const jsc8 = require("jsc8")

const client = new jsc8("https://gdn1.macrometa.io");

async function getFabric() {
    await console.log("Logging in...");
    await client.login("guest@macrometa.io", "guest");
    await console.log("Using the guest tenant...");  
    client.useTenant("guest");

    try{
      await console.log("Using the demoFabric...");  
      client.useFabric("_system")

      await console.log("Getting the fabric details...");
      let result = await client.get();

      await console.log("result is: ", result)
    } catch(e){
      await console.log("Fabric details could not be fetched because "+ e)
    }
}

getFabric();

Create Collection

We can now create a collection in the fabric. To do this, first connect to the fabric and then create a collection called employees.

The example below shows the steps.

from c8 import C8Client
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

demotenant = client.tenant(email='guest@macrometa.io', password='guest')
print("Get geo fabric...")
fabric = demotenant.useFabric('_system')
# Create a new collection named "employees".
employees = fabric.create_collection('employees')

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io")

async function createCollection() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io", "guest");

  await console.log("Using the guest tenant");
  client.useTenant("guest");

  await console.log("Using the demoFabric...");
  client.useFabric("_system");

  await console.log("Creating the collection object to be used...");
  let collection = client.collection('employees');

  await console.log("Creating the collection employees under demoFabric...");
  let collectionDetails;
  try{
    collectionDetails = await collection.create(); 
    await console.log("The collection details are: ", collectionDetails);
  } catch(e){
    return "Collection creation did not succeed due to " + e
  }

  return "Collection " + collectionDetails.name + " created successfully"  
}

createCollection().then(console.log)

Create Index

Let's add a hash index on the email and _key fields of the employees collection. Refer to the reference guide for details on other available index types.

from c8 import C8Client
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

demotenant = client.tenant(email='guest@macrometa.io', password='guest')
print("Get geo fabric...")
fabric = demotenant.useFabric('_system')
employees = fabric.collection('employees')  # Get the existing "employees" collection.
employees.add_hash_index(fields=['email', '_key'], unique=True)  # Add a unique hash index.
print("Index created successfully")

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io")

async function createIndex() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io", "guest");

  await console.log("Using the demotenant");
  client.useTenant("guest");

  await console.log("Using the Fabric...");
  client.useFabric("_system");

  await console.log("Creating the collection object to be used...");
  let collection = client.collection('employees');

  await console.log("Creating the index on collection employees under demoFabric...");
  let index;
  try{
    index = await collection.createIndex({type: 'hash', fields: ['email', '_key']}); 
    await console.log("The index details are: ", index);
  } catch(e){
    return "Index creation did not succeed due to " + e
  }

  return "Index created successfully"  
}

createIndex().then(console.log)
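
To illustrate what a unique hash index over several fields enforces, here is a toy in-memory version. It is a sketch under stated assumptions, not GDN's implementation: UniqueHashIndex is a hypothetical class, and it assumes that uniqueness applies to the combination of the indexed field values.

```python
class UniqueHashIndex:
    """Toy unique index over a tuple of fields (not GDN's implementation)."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self._entries = {}   # tuple of field values -> document _key

    def _index_key(self, doc):
        return tuple(doc.get(field) for field in self.fields)

    def add(self, doc):
        index_key = self._index_key(doc)
        if index_key in self._entries:
            raise ValueError("unique constraint violated for {}".format(index_key))
        self._entries[index_key] = doc["_key"]

    def lookup(self, **values):
        """Find a document _key by exact match on all indexed fields."""
        return self._entries.get(tuple(values.get(f) for f in self.fields))


index = UniqueHashIndex(["email", "_key"])
index.add({"_key": "Jean", "email": "jean.picard@macrometa.io"})
# Same email, different _key: the (email, _key) pair is still unique.
index.add({"_key": "JL", "email": "jean.picard@macrometa.io"})
print(index.lookup(email="jean.picard@macrometa.io", _key="Jean"))   # Jean

try:
    index.add({"_key": "Jean", "email": "jean.picard@macrometa.io"})
except ValueError as error:
    print(error)
```

Under that assumption, an index that includes _key cannot reject a duplicate email on its own, because _key is already unique within a collection. Indexing email alone would be the way to keep email addresses unique.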

Insert Documents

Let's insert documents into the employees collection as shown below.

from c8 import C8Client
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)
demotenant = client.tenant(email='guest@macrometa.io', password='guest')

print("Get geo fabric...")
fabric = demotenant.useFabric('_system')
employees = fabric.collection('employees') # get collection named "employees".

# insert documents into the collection
employees.insert({'_key': 'Jean', 'firstname': 'Jean', 'lastname': 'Picard', 'email': 'jean.picard@macrometa.io'})
employees.insert({'_key': 'James', 'firstname': 'James', 'lastname': 'Kirk', 'email': 'james.kirk@macrometa.io'})
employees.insert({'_key': 'Han', 'firstname': 'Han', 'lastname': 'Solo', 'email': 'han.solo@macrometa.io'})
employees.insert({'_key': 'Bruce', 'firstname': 'Bruce', 'lastname': 'Wayne', 'email': 'bruce.wayne@macrometa.io'})

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io")


const docJean = {'_key': 'Jean', 'firstname': 'Jean', 'lastname': 'Picard', 'email': 'jean.picard@macrometa.io'}

const docJames = {'_key': 'James', 'firstname': 'James', 'lastname': 'Kirk', 'email': 'james.kirk@macrometa.io'}

const docHan = {'_key': 'Han', 'firstname': 'Han', 'lastname': 'Solo', 'email': 'han.solo@macrometa.io'}

const docBruce = {'_key': 'Bruce', 'firstname': 'Bruce', 'lastname': 'Wayne', 'email': 'bruce.wayne@macrometa.io'}

const docs = [docJean, docJames, docHan, docBruce]


async function populate() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io", "guest");

  await console.log("Using the guest tenant");
  client.useTenant("guest");

  await console.log("Using the Fabric...");
  client.useFabric("_system");

  await console.log("Creating the collection object to be used...");
  let collection = client.collection('employees');

  for (let doc of docs) {
    await collection.save(doc)
  }
  await console.log("collection populated with documents")
}

populate()

Query documents using C8QL

C8QL is C8's query language. You can run a C8QL query on the newly created employees collection to retrieve its contents.

The query FOR employee IN employees RETURN employee is equivalent to SQL's SELECT * FROM employees.

from c8 import C8Client

print("Create connection...")
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

print("Get tenant...")
demotenant = client.tenant(email='guest@macrometa.io', password='guest')

print("Get geo fabric...")
fabric = demotenant.useFabric('_system')

print("Execute C8QL query...")
cursor = fabric.c8ql.execute('FOR employee IN employees RETURN employee') 

print("Retrieve documents...")
docs = [document for document in cursor]
print(docs)

const jsc8 = require("jsc8")
const c8ql = jsc8.c8ql
const client = new jsc8("https://gdn1.macrometa.io")

async function c8Queries() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io",  "guest");

  await console.log("Using the tenant");
  client.useTenant("guest");

  await console.log("Using the Fabric...");
  client.useFabric("_system");

  await console.log("Creating the collection object to be used...");
  let collection = client.collection('employees');

  const cursor = await client.query(c8ql`FOR employee IN employees RETURN employee`);
  const result = await cursor.all();
  await console.log(result)
}

c8Queries()
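
The comparison with SQL can be made concrete without a GDN connection. The sketch below is an analogy, not driver code: it treats the collection as a plain Python list, reads the C8QL loop as a list comprehension, and runs the SQL counterpart against an in-memory SQLite table. The sample rows are illustrative.

```python
import sqlite3

employees = [
    {"_key": "Jean", "firstname": "Jean", "lastname": "Picard"},
    {"_key": "James", "firstname": "James", "lastname": "Kirk"},
]

# FOR employee IN employees RETURN employee
# reads as: visit every document in the collection and return it unchanged.
c8ql_like = [employee for employee in employees]

# The SQL counterpart, run against an in-memory SQLite table.
connection = sqlite3.connect(":memory:")
connection.row_factory = sqlite3.Row
connection.execute("CREATE TABLE employees (_key TEXT, firstname TEXT, lastname TEXT)")
connection.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(e["_key"], e["firstname"], e["lastname"]) for e in employees],
)
sql_rows = [
    dict(row)
    for row in connection.execute("SELECT * FROM employees ORDER BY rowid")
]
connection.close()

print(c8ql_like == sql_rows)   # True
```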

Get realtime updates

The following example subscribes to real-time updates from a collection in a fabric:

from c8 import C8Client
import warnings
warnings.filterwarnings("ignore")

def callback_fn(event):
    print(event)

print("Create connection...")
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

print("Get tenant...")
demotenant = client.tenant(email='guest@macrometa.io', password='guest')
print("Get geo fabric...")
fabric = demotenant.useFabric('_system')

print("Subscribe to receive updates on employees collection...")
fabric.on_change("employees", callback=callback_fn)

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io")

async function callback_fn(collection){
  await console.log("Connection open on ", collection.name)
}

async function realTimeListener() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io", "guest");

  await console.log("Using the tenant");
  client.useTenant("guest");

  await console.log("Using the Fabric...");
  client.useFabric("_system");

  await console.log("Creating the collection object to be used...");
  let collection = client.collection('employees');

  collection.onChange({
    onmessage: (msg) => console.log("message=>", msg),
    onopen: () => callback_fn(collection),
    onclose: () => console.log("connection closed")
  }, "gdn1.macrometa.io");
}

realTimeListener()
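
In the Python example, callback_fn only prints each event. One possible variation is to buffer events in a thread-safe queue and process them elsewhere. This sketch assumes only that the callback receives a single event argument, as in the example above. The event shape and the drain helper are hypothetical, and the deliveries are simulated rather than sent by GDN.

```python
import queue

events = queue.Queue()


def callback_fn(event):
    """Hand the event to a queue instead of processing it in the listener."""
    events.put(event)


def drain(pending):
    """Return everything currently buffered, without blocking."""
    drained = []
    while True:
        try:
            drained.append(pending.get_nowait())
        except queue.Empty:
            return drained


# Simulated deliveries; with pyC8 the driver would call callback_fn instead.
callback_fn({"_key": "Jean"})
callback_fn({"_key": "James"})
buffered = drain(events)
print(buffered)
```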

Spot Collections

Create a geo-fabric with spot region capabilities. Then create a collection that is designated as a spot collection. A geo-fabric can contain both regular and spot collections.

from c8 import C8Client
import json

spot_fabric_name = "spot-geo-fabric"
spot_collection_name = "spot-collection"

# Initialize the client for C8DB.
client = C8Client(protocol='https', host='gdn1.macrometa.io', port=443)

# Create a geo-fabric and pass one of the spot regions, selecting the mode with
# SPOT_CREATION_TYPES. With AUTOMATIC, the system assigns a random spot region.
# With None, the geo-fabric is created without spot properties. With SPOT_REGION,
# pass the chosen spot region in the spot_dc parameter.

print("Get tenant...")
tenant = client.tenant(email='guest@macrometa.io', password='guest')
print("Get geo fabric...")
fabric = tenant.useFabric('_system')
# fabric.delete_fabric(spot_fabric_name)
local_region = tenant.localdc()
fabric.create_fabric(
    spot_fabric_name,
    dclist=tenant.dclist(),
    spot_creation_type=fabric.SPOT_CREATION_TYPES.SPOT_REGION,
    spot_dc=local_region["tags"]["url"].split(".")[0],
    users=[{"username": "guest", "password": "guest", "active": True}]
)

spot_fabric = tenant.useFabric(spot_fabric_name)
print("Create spot collection ...")
spot_collection = spot_fabric.create_collection(spot_collection_name, spot_collection=True)

print("Insert documents into spot collection ...")
spot_collection.insert({'firstname': 'Jean', 'lastname':'Picard', 'email':'jean.picard@macrometa.io'})
spot_collection.insert({'firstname': 'James', 'lastname':'Kirk', 'email':'james.kirk@macrometa.io'})
spot_collection.insert({'firstname': 'Han', 'lastname':'Solo', 'email':'han.solo@macrometa.io'})
spot_collection.insert({'firstname': 'Bruce', 'lastname':'Wayne', 'email':'bruce.wayne@macrometa.io'})

print("Execute C8QL query...")
cursor = spot_fabric.c8ql.execute('FOR doc IN @@collection RETURN doc', bind_vars={"@collection":spot_collection_name}) 

print("Retrieve documents...")
docs = [document for document in cursor]
print(docs)

print("clearing collection")
spot_fabric.delete_collection(spot_collection_name)
fabric = tenant.useFabric('_system')
fabric.delete_fabric(spot_fabric_name)

const jsc8 = require("jsc8")
const client = new jsc8("https://gdn1.macrometa.io")
const collection_name = "accounts"

async function spotCollection() {
  await console.log("Logging in...");
  await client.login("guest@macrometa.io", "guest");

  await console.log("Using the demotenant");
  client.useTenant("guest");

  await console.log("Using the demoFabric...");
  client.useFabric("_system");

  try {
    console.log("Creating the spot collection " + collection_name + "...");
    const collection_address = client.collection(collection_name);
    const exists_coll = await collection_address.exists();
    if (exists_coll === false) {
      // Mark the collection as a spot collection when creating it.
      await collection_address.create({isSpot: true});
    }
  } catch (e) {
    console.log("Collection creation did not succeed due to " + e);
  }
}

spotCollection()

RESTQL

RESTQL enables developers to quickly convert saved C8QL queries into geo-distributed REST APIs. This eliminates the need for separate backend servers and containers for CRUD operations.

from c8 import C8Client

fed_url = "gdn1.macrometa.io"
guest_mail = "guest@macrometa.io"
guest_password = "guest"
geo_fabric = "_system"
collection_name = "person"

value = "INSERT {'firstname':@firstname, 'lastname':@lastname, 'email':@email, 'zipcode':@zipcode, '_key': 'abc'} IN %s" % collection_name
parameter = {"firstname": "", "lastname": "", "email": "", "zipcode": ""}

insert_data = {"query": {"name": "insertRecord", "parameter": parameter, "value": value}} 
get_data = {"query": {"name": "getRecords", "value": "FOR doc IN %s RETURN doc" % collection_name}}
update_data = {"query": {"name": "updateRecord", "value": "UPDATE 'abc' WITH { \"lastname\": \"cena\" } IN %s" % collection_name }}
delete_data= {"query": {"name": "deleteRecord", "value": "REMOVE 'abc' IN %s" % collection_name}}
get_count = {"query": {"name": "countRecords", "value": "RETURN COUNT(FOR doc IN %s RETURN 1)" % collection_name}}

if __name__ == '__main__':

    print("\n ------- CONNECTION SETUP  ------")
    print("tenant: {}, geofabric:{}".format(guest_mail, geo_fabric))
    client =  C8Client(protocol='https', host=fed_url, port=443)
    tenant = client.tenant(guest_mail, guest_password)
    fabric = tenant.useFabric(geo_fabric)

    print("Available regions....")
    dclist = fabric.dclist(detail=False)
    for dc in dclist:
        print("region: {}".format(dc))
    print("Connected to closest region...\tregion: {}".format(fabric.localdc(detail=False)))

    print("\n ------- CREATE GEO-REPLICATED COLLECTION  ------")
    employees = fabric.create_collection(collection_name)
    print("Created collection: {}".format(collection_name))

    print("\n ------- CREATE RESTQLs  ------")
    fabric.save_restql(insert_data)  # name: insertRecord
    fabric.save_restql(get_data)  # name: getRecords
    fabric.save_restql(update_data)  # name: updateRecord
    fabric.save_restql(delete_data)  # name: deleteRecord
    fabric.save_restql(get_count)  # name: countRecords
    print("Created RESTQLs:{}".format(fabric.get_all_restql()))

    print("\n ------- EXECUTE RESTQLs ------")
    print("Insert data....")
    response = fabric.execute_restql(
        "insertRecord",
        {"bindVars": {"firstname": "john", "lastname": "doe",
                      "email": "john.doe@macrometa.io", "zipcode": "511037"}})
    print("Get data....")
    response = fabric.execute_restql("getRecords")
    print("Update data....")
    response = fabric.execute_restql("updateRecord")
    print("Get data....")
    response = fabric.execute_restql("getRecords")
    print("Count records....")
    response = fabric.execute_restql("countRecords")
    print("Delete data....")
    response = fabric.execute_restql("deleteRecord")

    print("\n ------- DELETE RESTQLs ------")
    fabric.delete_restql("insertRecord")
    fabric.delete_restql("getRecords")
    fabric.delete_restql("updateRecord")
    fabric.delete_restql("countRecords")
    fabric.delete_restql("deleteRecord")

    print("\n ------- DONE  ------")

const jsc8 = require('jsc8')

//Variables
const client = new jsc8("https://gdn1.macrometa.io")
const guest_email = "guest@macrometa.io"
const guest_password = "guest"
const geo_fabric = "_system"
const collection_name = "addresses" + Math.floor(1000 + Math.random() * 9000).toString()

//Queries
const insert_data = "INSERT {'firstname':@firstname, 'lastname':@lastname, 'email':@email, 'zipcode':@zipcode, '_key': 'abc'} IN " + collection_name

const get_data = "FOR doc IN " + collection_name + " RETURN doc"

const update_data = "UPDATE 'abc' WITH {'lastname': @lastname } IN " + collection_name

const delete_data = "REMOVE 'abc' IN " + collection_name

const get_count = "RETURN COUNT(FOR doc IN " + collection_name + " RETURN 1)"


async function restqldemo() {
  await client.login(guest_email, guest_password);

  client.useFabric(geo_fabric);

  console.log("------- CREATE GEO-REPLICATED COLLECTION  ------")

  const collection = client.collection(collection_name);

  await collection.create()

  console.log("Collection " + collection_name + " created.\n")

  console.log("------- SAVING THE QUERIES  ------")

  await client.saveQuery("insertData", {}, insert_data)
  await client.saveQuery("getData", {}, get_data)
  await client.saveQuery("updateData", {}, update_data)
  await client.saveQuery("deleteData", {}, delete_data)
  await client.saveQuery("getCount", {}, get_count)

  console.log("Saved Queries Successfully\n")

  console.log("------- EXECUTING THE QUERIES  ------")

  const bindVars = {
    "firstname": "john", "lastname": "doe",
    "email": "john.doe@macrometa.io", "zipcode": "511037"
  }

  await client.executeSavedQuery("insertData", bindVars)

  console.log("Data Inserted \n")

  const res = await client.executeSavedQuery("getData")

  console.log("Output of get data query:")
  console.log(res.result)
  console.log("\n")

  await client.executeSavedQuery("updateData", { "lastname": "mathews" })

  console.log("Data updated \n")

  const data = await client.executeSavedQuery("getData")

  console.log("Output of get data query after update:")

  console.log(data.result)

  console.log("\n")

  const count = await client.executeSavedQuery("getCount")

  console.log("Count:")

  console.log(count.result)

  await client.executeSavedQuery("deleteData")
}

console.log("Starting Execution")
restqldemo().catch(console.error)
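
The Python example builds each saved-query payload by hand as a nested dictionary. A small helper can make that shape explicit. This is a sketch only: restql_payload is a hypothetical function, not part of pyC8, and it reproduces nothing beyond the {"query": {"name", "value", "parameter"}} layout used in the example above.

```python
def restql_payload(name, value, parameter=None):
    """Build the dict shape that the Python example passes to save_restql."""
    query = {"name": name, "value": value}
    if parameter is not None:
        query["parameter"] = parameter
    return {"query": query}


collection_name = "person"
get_data = restql_payload(
    "getRecords", "FOR doc IN {} RETURN doc".format(collection_name)
)
insert_data = restql_payload(
    "insertRecord",
    "INSERT {'firstname': @firstname} IN " + collection_name,
    parameter={"firstname": ""},
)
print(get_data)
# {'query': {'name': 'getRecords', 'value': 'FOR doc IN person RETURN doc'}}
```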