Working with Batch API in Sitecore CDP


In my previous post, I gave a brief introduction to the Batch API in Sitecore CDP.

Today I am going to walk through how to work with the Batch API in Sitecore CDP.

Using the Batch API involves uploading one or more batch files into Sitecore CDP. Before you start uploading a batch file, you must ensure that the file meets the formatting requirements.

A batch file is a gzipped file (.gz) that must contain at least one JSON file, so its name ends in .json.gz. The Batch API accepts only gzipped files. For example, if the data that you want to upload to Sitecore CDP is in import.json, you must gzip that file to produce import.json.gz.

Formatting Requirements for Batch File

The JSON file must contain a separate JSON record for each entity that is to be imported.

Each JSON record within the JSON file must be valid JSON.

Each JSON record within the JSON file must be minified, that is, it must be contained on a single line.

Each JSON record must be terminated with a carriage return.

Each JSON record must be encoded according to RFC 4627.

Note: Because a single file contains multiple JSON records, the file as a whole is not valid JSON. This is expected. What matters is that each individual record is valid JSON.

The table below describes the required fields that each JSON record must include.
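The formatting rules above can be sketched in Python. This is only a minimal illustration, not part of any Sitecore tooling: the helper name to_batch_lines is hypothetical, and the sketch assumes that a newline character ("\n") is accepted as the record terminator.

```python
import json

def to_batch_lines(records):
    """Serialize each record as minified, single-line JSON, one record per line."""
    lines = []
    for record in records:
        # Compact separators drop optional whitespace; json.dumps escapes any
        # newline inside a string value, so each record stays on one line.
        lines.append(json.dumps(record, separators=(",", ":")))
    # Assumption: "\n" is used as the record terminator.
    return "\n".join(lines) + "\n"

records = [
    {"ref": "CB8B4A20-9C46-43A9-8B64-D1414939612F", "schema": "guest", "mode": "upsert",
     "value": {"guestType": "customer", "firstName": "Jenny", "lastName": "Jack",
               "email": "jennyjack@test.com",
               "identifiers": [{"provider": "email", "id": "jennyjack@test.com"}]}},
    {"ref": "B721DE88-7E5D-44FC-9320-93EA1A5CA159", "schema": "guest", "mode": "upsert",
     "value": {"guestType": "customer", "firstName": "Max", "lastName": "Miller",
               "email": "maxmiller@test.com",
               "identifiers": [{"provider": "email", "id": "maxmiller@test.com"}]}},
]

payload = to_batch_lines(records)

# Each line parses on its own, even though the file as a whole is not valid JSON.
for line in payload.splitlines():
    json.loads(line)
```

Writing payload to import.json (in UTF-8) would give a file in the shape described above.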


Some points to remember:

1. Not every field of an object has to be present during the upload. The exception is guest data extensions, where, depending on the type of guest data extension, all the fields might be required.

2. When using upsert mode for a guest, you only need to include enough data to identify the guest within Sitecore CDP, plus the fields that need to be updated.

3. When using insert mode for a guest, you need to include all the fields required to make the entity valid, because the operation must result in a valid entity.

Using Postman to upload Batch files in Sitecore CDP

Let’s first create a JSON file with the required details. I am creating an import.json file with the JSON records below.

{"ref":"CB8B4A20-9C46-43A9-8B64-D1414939612F","schema":"guest","mode":"upsert","value":{"guestType":"customer","firstName":"Jenny","lastName":"Jack","email":"jennyjack@test.com","identifiers":[{"provider":"email","id":"jennyjack@test.com"}]}}
{"ref":"B721DE88-7E5D-44FC-9320-93EA1A5CA159","schema":"guest","mode":"upsert","value":{"guestType":"customer","firstName":"Max","lastName":"Miller","email":"maxmiller@test.com","identifiers":[{"provider":"email","id":"maxmiller@test.com"}]}}


Before you upload a batch file, you must gzip your JSON file and use the gzipped file during the upload process. To gzip your JSON file you can use a file archiver like 7-Zip.


Note: There is a 50MB size limit for uploading batch files. If the size of the gzipped file exceeds the 50MB limit, then you must recompress the original JSON files into two or more gzipped files such that they do not exceed the 50MB size limit. Then, upload the gzipped files as separate batches.
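If you prefer to script this step instead of using a file archiver, here is a minimal Python sketch using only the standard library. The function names are hypothetical, and the sketch assumes that "50MB" means 50 * 1024 * 1024 bytes; adjust the constant if the limit is counted differently.

```python
import gzip
import os
import shutil

# Assumption: the 50MB limit is interpreted as 50 * 1024 * 1024 bytes.
MAX_BATCH_BYTES = 50 * 1024 * 1024

def gzip_file(src_path, dst_path=None):
    """Compress src_path into dst_path (default: src_path + '.gz') and return dst_path."""
    dst_path = dst_path or src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream the file so large inputs are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
    return dst_path

def fits_in_one_batch(gz_path, limit=MAX_BATCH_BYTES):
    """Return True when the gzipped file is within the size limit."""
    return os.path.getsize(gz_path) <= limit
```

Calling gzip_file("import.json") would produce import.json.gz next to the original file; if fits_in_one_batch returns False, split the records into several JSON files and gzip each one separately.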

After gzipping the JSON file, note down the details below. You will need them while uploading the batch file.

A. File size in bytes - The size in bytes of the gzipped file (import.json.gz). You can check this in the file's properties.

B. MD5 checksum - A hex-encoded MD5 checksum of the gzipped file. It is used to verify that the gzipped file arrives intact. You can compute it with an online MD5 file checksum tool.

C. Base64 string - The MD5 checksum from the previous step, converted to Base64. Note that this is the Base64 encoding of the raw 16-byte digest that the hex string represents, not of the hex characters themselves. It is used as the value of the “Content-Md5” header in the upload request.

D. UUID for Batch reference - The UUID of a batch upload. This is a UUID that you generate. It must be unique across all batches. You can use an online tool such as https://guidgenerator.com/online-guid-generator.aspx to generate a UUID.
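The four values above can also be computed locally. The following is a small sketch with Python's standard library; the function name and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import base64
import hashlib
import uuid

def batch_upload_details(gz_path):
    """Compute the size, MD5 (hex and Base64) and a fresh UUID for a gzipped batch file."""
    with open(gz_path, "rb") as f:
        data = f.read()  # batch files are limited to 50MB, so reading at once is fine
    digest = hashlib.md5(data)
    return {
        # A. File size in bytes
        "size": len(data),
        # B. Hex-encoded MD5 checksum
        "checksum": digest.hexdigest(),
        # C. Base64 of the raw 16-byte digest (not of the hex string)
        "content_md5": base64.b64encode(digest.digest()).decode("ascii"),
        # D. Randomly generated UUID to use as the batch reference
        "batch_ref": str(uuid.uuid4()),
    }
```

For example, batch_upload_details("import.json.gz") returns all four values in one dictionary, ready to be used in the requests that follow.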

Now open Postman and create a PUT request with Authorization set to “Basic Auth”, where “Username” is the Client Key and “Password” is the API Token. The URL for the request is <baseURL>/v2/batches/<BatchRefUUID>. In the body section, add raw JSON containing the checksum as a string and the size as an integer.

You can find the baseURL for your tenant in the table below.


BatchRefUUID is a UUID that you generate. It must be unique across all batches.

{
 "checksum": "9a8bf8a3e7280ed3509f72824828d2ac",
 "size": 265
}
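The same request can be prepared in code. Below is a rough Python sketch that only builds the request object and does not send it. The function name is hypothetical, the base URL in the usage note is a placeholder rather than a real tenant URL, and the Content-Type header is an assumption based on sending raw JSON.

```python
import base64
import json
import urllib.request

def build_create_batch_request(base_url, batch_ref, client_key, api_token, checksum, size):
    """Build (but do not send) the PUT request that registers a batch upload."""
    # Basic Auth: Base64 of "username:password" (Client Key and API Token).
    credentials = f"{client_key}:{api_token}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Body: checksum as a string, size as an integer.
    body = json.dumps({"checksum": checksum, "size": int(size)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v2/batches/{batch_ref}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Assumption: the raw JSON body is declared as application/json.
            "Content-Type": "application/json",
        },
    )
```

With a placeholder such as base_url = "https://api.example.invalid", passing the returned object to urllib.request.urlopen would send the request; the JSON response can then be read and parsed with json.loads.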


In the response, the href key within the location object contains the batch upload path. We will use this path in the next step. The batch upload path is valid for 1 hour.

Now make another PUT request and paste the batch upload path obtained in the last step into the request URL field. This request does not need Authorization. Add two headers: “x-amz-server-side-encryption” with the value “AES256”, and “Content-Md5” with the Base64 string we obtained earlier. Finally, select the gzipped file as binary in the body section.
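As a sketch of the same step in Python, the snippet below reads the upload path from the parsed response of the first request and builds the second PUT request without sending it. Both function names are hypothetical, and the response is assumed to be already parsed into a dictionary.

```python
import urllib.request

def upload_url_from(create_response):
    """Return the batch upload path from the parsed response of the first PUT request."""
    # The href key inside the location object holds the upload path.
    return create_response["location"]["href"]

def build_upload_request(upload_url, gz_bytes, content_md5):
    """Build (but do not send) the PUT request that uploads the gzipped file."""
    return urllib.request.Request(
        url=upload_url,
        data=gz_bytes,  # raw bytes of import.json.gz
        method="PUT",
        headers={
            # No Authorization header is needed for this request.
            "x-amz-server-side-encryption": "AES256",
            "Content-Md5": content_md5,  # the Base64 string from step C
        },
    )
```

One caveat if you send this with urllib.request.urlopen: urllib adds a default Content-Type header when a body is present and none is set. Whether the upload endpoint accepts that header has not been verified here, so treat this as a starting point rather than a drop-in replacement for the Postman request.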


This request returns an empty response with a 200 OK status.

Now we will make a GET request to check the status of the batch file upload. The URL for the request is <baseURL>/v2/batches/<BatchRefUUID>, with Authorization set to “Basic Auth”, where “Username” is the Client Key and “Password” is the API Token. Add one header, “Accept”, with the value “application/json”.

Note: BatchRefUUID must be the same UUID that we used for the first PUT request.


In the response, the status object contains information about the upload status. Initially, the status is "processing". After the upload succeeds, the status is set to "success".
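Because the status starts as "processing", a script has to repeat the GET request until it changes. Here is a minimal polling sketch in Python. The function name and parameters are hypothetical, and fetch_status is a callable that you supply: it should perform the GET request and return the status string. Where exactly that string sits inside the status object is not shown here, so extracting it is left to the caller.

```python
import time

def wait_for_batch(fetch_status, interval_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until the batch is no longer 'processing'.

    Returns the first status that is not 'processing' (for example 'success').
    Raises TimeoutError if the batch is still processing after max_attempts polls.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status != "processing":
            return status
        # Wait before asking again to avoid hammering the API.
        sleep(interval_seconds)
    raise TimeoutError("batch is still processing after the maximum number of polls")
```

The sleep parameter exists only so the wait can be replaced in tests; in normal use the default time.sleep is fine.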


After the upload succeeds, you can log in to Sitecore CDP to find the uploaded data. Here we can search for the guests listed in our sample JSON records.


We have successfully ingested the data into Sitecore CDP with the Batch API.

That’s all for today.
Happy coding!
Chirag Goel

I am a developer who likes to work on different emerging technologies.
