BatchJobService

BatchJobService A new approach to batch mutates

Transcript of BatchJobService

Page 1: BatchJobService

BatchJobService: A new approach to batch mutates

Page 2: BatchJobService

Agenda

● Sync vs. Async

● Batch job flow

● Temporary IDs

● Incremental Uploads

● Best practices

● Limitations

Page 3: BatchJobService

Synchronous vs. Asynchronous

Page 4: BatchJobService

Synchronous Requests

[Sequence diagram with lanes "API Services" and "Backend"; messages: Request, RPC, RPC return, Response]

Sends operation

OK: receives status of the operation

Page 5: BatchJobService

Asynchronous Requests

[Sequence diagram with lanes "API Async Services" and "Backend"; messages: Request, Response, RPC, RPC return, Operations, Op. return]

Sends all Operations

OK: operations will be processed

Operations enter an execution queue, and release the request.

Check operations status

Server sends the results

Page 6: BatchJobService

Async Processing Isn't Free

Asynchronous processes usually incur additional overhead, such as:

● Scheduling

● Management/load-balancing of the job queue

● Retry of failed operations

Page 7: BatchJobService

Synchronous vs. Asynchronous

Use Case Sync Async

Highly responsive user interface

Nightly bulk operations job

Fire-and-forget

Maximum throughput

Page 8: BatchJobService

AdWords API Batch Job Flow

Page 9: BatchJobService

Batch Processing - Why Use It?

● Automatic retry of operations that encounter transient errors such as:
○ CONCURRENT_MODIFICATION
○ UNEXPECTED_INTERNAL_API_ERROR

● Automatic retry of operations that encounter rate limits

● Your application won't be blocked waiting for a response.

Page 10: BatchJobService

Limitations of MutateJobService

● Error handling is difficult
○ Difficult to find the operation that failed

● Can't create dependent entities in a single job

● MutateJobService has a much lower limit on the number of operations (in the thousands)

Page 11: BatchJobService

Introducing BatchJobService

● Allows setting temporary IDs on new entities for dependent entity creation

● Each result contains an operation index for correlating the result to its operation

● Much higher operations limit of 1 GB of unprocessed operations

● Utilities are available in all client libraries

Page 12: BatchJobService

Farewell, MutateJobService

● BatchJobService is a replacement for MutateJobService

● MutateJobService is going away

Page 13: BatchJobService

Supported Operations

● AdGroupAdLabelOperation
● AdGroupAdOperation
● AdGroupBidModifierOperation
● AdGroupCriterionLabelOperation
● AdGroupCriterionOperation
● AdGroupLabelOperation
● AdGroupOperation
● BudgetOperation
● CampaignCriterionOperation
● CampaignLabelOperation
● CampaignOperation
● FeedItemOperation

AdGroupAdService

AdGroupService

BudgetService

AdGroupBidModifierService

CampaignCriterionService

FeedItemService

CampaignService

AdGroupCriterionService

Page 14: BatchJobService

Basic Steps

1. Create a BatchJob with BatchJobService; the returned Job carries the uploadUrl

2. Upload the list of operations to the uploadUrl

3. Poll the batch job's status periodically

4. Download the results of the job from the downloadUrl

Page 15: BatchJobService

Creating the Batch Job

● Create the BatchJob object

BatchJobOperation addOp = new BatchJobOperation();
addOp.setOperator(Operator.ADD);
addOp.setOperand(new BatchJob());

BatchJob batchJob =
    batchJobService.mutate(new BatchJobOperation[] {addOp}).getValue(0);

● Make sure to grab the uploadUrl

// Get the upload URL from the new job.
String uploadUrl = batchJob.getUploadUrl().getUrl();

Page 16: BatchJobService

Creating the Batch Job (Cont.)

● Job status will be AWAITING_FILE

System.out.printf(
    "Created BatchJob with ID %d, status '%s' and upload URL %s.%n",
    batchJob.getId(), batchJob.getStatus(), uploadUrl);

$ Created BatchJob with ID <ID>, status 'AWAITING_FILE' and upload URL <URL>.

● Creating the BatchJob is a synchronous operation!

Page 17: BatchJobService

Uploading the Operations

● First, construct operations as usual

Campaign campaign = new Campaign();
campaign.setName("Batch Campaign " + r.nextInt());
campaign.setStatus(CampaignStatus.PAUSED);
campaign.setAdvertisingChannelType(AdvertisingChannelType.SEARCH);
Budget budget = new Budget();
campaign.setBudget(budget);

BiddingStrategyConfiguration biddingStrategyConfiguration =
    new BiddingStrategyConfiguration();
// Fill the biddingStrategyConfiguration
campaign.setBiddingStrategyConfiguration(biddingStrategyConfiguration);

CampaignOperation operation = new CampaignOperation();
operation.setOperand(campaign);
operation.setOperator(Operator.ADD);
operations.add(operation); // Add to the list of operations

Page 18: BatchJobService

Uploading the Operations (Cont.)

● The next step is to send operations to the uploadUrl

● uploadUrl is valid for one week

Request method: POST

URL: Upload URL returned by BatchJobService.mutate

Content-Type HTTP header: application/xml

Request body: A mutate element in XML form, as specified in BatchJobOps.xsd
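The request attributes above can be sketched in plain Java. This is a minimal illustration, not the client library's implementation: it assumes Java 11+ (java.net.http), the class and method names are hypothetical, and the URL and XML in main are placeholders. It only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Sketch: builds (but does not send) the single-request upload described above. */
final class UploadRequestSketch {

    /**
     * @param uploadUrl the upload URL returned by BatchJobService.mutate
     * @param mutateXml a mutate element in XML form (see BatchJobOps.xsd)
     */
    static HttpRequest buildUploadRequest(String uploadUrl, String mutateXml) {
        return HttpRequest.newBuilder(URI.create(uploadUrl))
                .header("Content-Type", "application/xml")
                .POST(HttpRequest.BodyPublishers.ofString(mutateXml, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions); a real URL comes from
        // batchJob.getUploadUrl().getUrl() and real XML from your operations.
        HttpRequest request = buildUploadRequest("https://example.invalid/upload", "<mutate/>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a matter of passing it to an HttpClient; in practice the client library utilities take care of both serialization and upload.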

Page 19: BatchJobService

Uploading the Operations (Cont.)

● All libraries have utilities to help build the request!

● BatchJobHelper for Java client library:

// Use a BatchJobHelper to upload all operations.
BatchJobHelper batchJobHelper = new BatchJobHelper(session);

batchJobHelper.uploadBatchJobOperations(operations, uploadUrl);
System.out.printf(
    "Uploaded %d operations for batch job with ID %d.%n",
    operations.size(), batchJob.getId());

Page 20: BatchJobService

Polling the Job Status

● After uploading operations, the BatchJob status will move to ACTIVE.

AWAITING_FILE

ACTIVE

CANCELED

DONE

Page 21: BatchJobService

Polling the Job Status (Cont.)

● Check the job status

int pollAttempts = 0;
boolean isPending = true;
Selector selector =
    new SelectorBuilder()
        .fields(
            BatchJobField.Id, BatchJobField.Status, BatchJobField.DownloadUrl,
            BatchJobField.ProcessingErrors, BatchJobField.ProgressStats)
        .equalsId(batchJob.getId())
        .build();
do {
  long sleepSeconds = (long) Math.scalb(30, pollAttempts);
  System.out.printf("Sleeping %d seconds...%n", sleepSeconds);
  Thread.sleep(sleepSeconds * 1000);
  batchJob = batchJobService.get(selector).getEntries(0);
  System.out.printf(
      "Batch job ID %d has status '%s'.%n", batchJob.getId(), batchJob.getStatus());
  pollAttempts++;
  isPending = PENDING_STATUSES.contains(batchJob.getStatus());
} while (isPending && pollAttempts < MAX_POLL_ATTEMPTS);

Page 22: BatchJobService

Polling the Job Status (Cont.)

● Poll for completion of the batch job with an exponential back off

● Check the job's status until it is CANCELED or DONE

● Log progressStats for large sets of operations
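The polling guidance above can be tried without the API. This is a minimal sketch: the Status enum, the PENDING_STATUSES set and the pollUntilFinished helper are hypothetical stand-ins built from the four statuses named on the slides, and the status source and sleeper are injected so the loop can run against fake data.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Sketch of "poll with exponential back off until CANCELED or DONE". */
final class PollingSketch {

    /** The four statuses named on the slides (local stand-in, not the API type). */
    enum Status { AWAITING_FILE, ACTIVE, CANCELED, DONE }

    /** Statuses that mean "keep polling"; an assumption derived from the slides. */
    static final Set<Status> PENDING_STATUSES =
            EnumSet.of(Status.AWAITING_FILE, Status.ACTIVE);

    /** Delay in seconds before poll number attempt (0-based): 30 * 2^attempt. */
    static long sleepSeconds(int attempt) {
        return (long) Math.scalb(30.0, attempt);
    }

    /**
     * Sleeps, fetches the status, and repeats with a doubled delay until the
     * status is no longer pending or maxAttempts polls have been made.
     */
    static Status pollUntilFinished(
            Supplier<Status> fetchStatus, LongConsumer sleeper, int maxAttempts) {
        Status status;
        int attempt = 0;
        do {
            sleeper.accept(sleepSeconds(attempt));
            status = fetchStatus.get();
            attempt++;
        } while (PENDING_STATUSES.contains(status) && attempt < maxAttempts);
        return status;
    }
}
```

In real code fetchStatus would wrap the batchJobService.get call and sleeper would wrap Thread.sleep; injecting both keeps the back-off logic easy to check in isolation.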

Page 23: BatchJobService

Downloading Results

● At this stage the job status can be DONE or CANCELED

● Let's check out the details...

Page 24: BatchJobService

Downloading Results - DONE

● Status: DONE

● Description: BatchJobService successfully parsed and attempted each of the uploaded operations

● Actions to take: Download the results for each operation from the batch job's downloadUrl

Page 25: BatchJobService

Downloading Results - CANCELED

● Status: CANCELED (rare)

● Description: An unexpected error occurred, or BatchJobService could not parse the uploaded operations

● Actions to take:
○ Inspect the list of processingErrors
○ Download the results for any successfully parsed operations from the batch job's downloadUrl

Page 26: BatchJobService

Downloading Results - CANCELED

● Some operations may still have been attempted

● Always check the downloadUrl for the results

● Details about the mutateResult object:
○ https://goo.gl/pXAhBS

Page 27: BatchJobService

Downloading Results (Cont.)

● downloadUrl returns mutateResults

● There is another utility to help!

if (batchJob.getDownloadUrl() != null && batchJob.getDownloadUrl().getUrl() != null) {
  BatchJobMutateResponse mutateResponse =
      batchJobUploadHelper.downloadBatchJobMutateResponse(
          batchJob.getDownloadUrl().getUrl());
  System.out.printf("Downloaded results from %s:%n", batchJob.getDownloadUrl().getUrl());

  for (MutateResult mutateResult : mutateResponse.getMutateResults()) {
    String outcome = mutateResult.getErrorList() == null ? "SUCCESS" : "FAILURE";
    System.out.printf("  Operation [%d] - %s\n", mutateResult.getIndex(), outcome);
  }
}

Page 28: BatchJobService

Downloading Results (Cont.)

● A mutateResult will have either a result or an errorList, but not both
○ result - The result of the corresponding successful operation, e.g., a Campaign object
○ errorList - The error list for the corresponding failed operation
○ index - The zero-based index of the corresponding operation

MutateResult
● result: Operand
● errorList: ErrorList
● index: long
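Because each result carries the index of its operation, failed operations can be looked up directly in the uploaded list. A minimal sketch of that correlation follows; the Result class is a hypothetical, simplified stand-in for a mutateResult (strings instead of Operand and ErrorList), not the client library type.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: map results back to the operations that produced them via the index. */
final class ResultCorrelationSketch {

    /** Hypothetical stand-in for a mutateResult: result or errorList is set, not both. */
    static final class Result {
        final long index;
        final String result;
        final List<String> errorList;

        Result(long index, String result, List<String> errorList) {
            this.index = index;
            this.result = result;
            this.errorList = errorList;
        }
    }

    /** Returns the uploaded operations whose results carry an errorList. */
    static <T> List<T> failedOperations(List<T> operations, List<Result> results) {
        List<T> failed = new ArrayList<>();
        for (Result r : results) {
            if (r.errorList != null) {
                // index is the zero-based position of the operation in the upload.
                failed.add(operations.get((int) r.index));
            }
        }
        return failed;
    }
}
```

The lookup does not rely on the results arriving in upload order, only on the index field.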

Page 29: BatchJobService

Errors

BatchJob.processingErrors (per job)
● Errors encountered while parsing uploaded operations - FILE_FORMAT_ERROR
● Unexpected errors such as issues writing results

MutateResult.errorList (per operation)
● Retrieved from BatchJob.downloadUrl
● Errors encountered while attempting to execute a single successfully parsed operation.

Page 30: BatchJobService

Temporary IDs

Page 31: BatchJobService

Temporary IDs

● Did you ever want to create a complete campaign in a single batch job?

● New feature introduced with BatchJobService
○ A negative number of type long
○ Just create an ID and reuse it in operations for dependent objects

Page 32: BatchJobService

Common Use Case

Create a Campaign in a single BatchJob:

Campaign
● id = -1

AdGroup
● id = -2
● campaignId = -1

CampaignLabel
● campaignId = -1

NegativeCampaignCriterion
● campaignId = -1

AdGroupAd
● adGroupId = -2

BiddableAdGroupCriterion
● adGroupId = -2
● criterion
  ○ text = "shoes"

BiddableAdGroupCriterion
● adGroupId = -2
● criterion
  ○ text = "jackets"

...

Page 33: BatchJobService

Important Notes

● The order of operations is very important!

● Keep this in mind when using temp IDs

● Create the parent object before creating its child objects
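The ordering rule above lends itself to a quick check before upload. The sketch below is illustrative only: the Op class is a hypothetical, simplified operation (real operations carry full entities), 0 is used here to mean "no temporary ID", and the allocator simply counts down from -1.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: allocate temporary IDs and verify parents are created before children. */
final class TempIdSketch {
    private long next = -1;

    /** Hands out -1, -2, -3, ... (negative numbers of type long). */
    long nextTempId() {
        return next--;
    }

    /** Hypothetical operation: creates id and may refer to parentId (0 means none). */
    static final class Op {
        final String name;
        final long id;
        final long parentId;

        Op(String name, long id, long parentId) {
            this.name = name;
            this.id = id;
            this.parentId = parentId;
        }
    }

    /** True if every temporary parentId was created by an earlier operation. */
    static boolean parentsComeFirst(List<Op> ops) {
        Set<Long> created = new HashSet<>();
        for (Op op : ops) {
            if (op.parentId < 0 && !created.contains(op.parentId)) {
                return false; // child appears before the parent it depends on
            }
            if (op.id < 0) {
                created.add(op.id);
            }
        }
        return true;
    }
}
```

With the use case from the slides, the Campaign (-1) has to precede the AdGroup (-2), which in turn has to precede its ads and criteria.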

Page 34: BatchJobService

Incremental Uploads

Page 35: BatchJobService

Introduction

● Allows multiple operation upload requests to the same BatchJob

● Job will start after last set of operations is sent out

● Follows the Google Cloud Storage guidelines

Page 36: BatchJobService

Incremental Upload Use Cases

● You build your BatchJob operations in phases or separate processes, so sending all operations at once isn't feasible

● You have a large set of operations and you don't want to send them all in one enormous upload request
○ E.g.: you may not want to send 500 MB of operations in a single POST request

Page 37: BatchJobService

Request Attributes

Request method: PUT

URL: Upload URL returned by BatchJobService.mutate

Content-Type HTTP header: application/xml

Content-Length HTTP header: The number of bytes in the contents of the current request.

Content-Range HTTP header: Range of bytes in the request, followed by total bytes. Total bytes will be * for the first and intermediate requests.

Request body: Operations in XML form, as specified in BatchJobOps.xsd.

Page 38: BatchJobService

More on the Request Body...

● BatchJobService will concatenate all the requests

● You just need to send the first and last markers:

Request        Start mutate element   End mutate element
First          Yes                    No
Intermediate   No                     No
Last           No                     Yes

Page 39: BatchJobService

More on the Request Body (Cont.)

● All requests will be parsed as a single document

● The concatenation of all requests has to be a complete XML document

● The size of each request body must be a multiple of 256K (262144) bytes
○ This rule does not apply to the last request
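The sizing rule above can be turned into a small calculation of the Content-Range value for each request. This is a sketch under stated assumptions: the class and method names are hypothetical, the values are formatted the way they appear on these slides, and nothing is uploaded.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: compute Content-Range values for an incremental upload. */
final class IncrementalUploadSketch {
    /** 256K; every request body except the last must be a multiple of this. */
    static final int CHUNK_MULTIPLE = 262144;

    /**
     * Returns one Content-Range value per request for a document of totalBytes
     * sent in chunks of chunkSize bytes. The total is "*" until the last request.
     */
    static List<String> contentRanges(long totalBytes, int chunkSize) {
        if (chunkSize <= 0 || chunkSize % CHUNK_MULTIPLE != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 262144");
        }
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < totalBytes; start += chunkSize) {
            long end = Math.min(start + chunkSize, totalBytes) - 1;
            boolean last = end == totalBytes - 1;
            ranges.add(start + "-" + end + "/" + (last ? Long.toString(totalBytes) : "*"));
        }
        return ranges;
    }
}
```

Google Cloud Storage resumable uploads write the header value with a leading "bytes " unit (for example "bytes 0-262143/*"); check the API guide for the exact form the upload URL expects.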

Page 40: BatchJobService

Request 1

Content-Range: 0-262143/*

<?xml version="1.0" encoding="UTF-8"?>
<ns1:mutate xmlns:ns1="https://adwords.google.com/api/adwords/cm/v201509">
<operations xsi:type="ns1:CampaignOperation">
<operator xsi:type="ns1:Operator">ADD</operator>
<operand xsi:type="ns1:Campaign">…
</operations>
<operations>
…
</operat

Content length of 262144, where the "t" in the last line is the 262144th byte.

Page 41: BatchJobService

Request 2

Content-Range: 262144-524287/*

ions>
<operations xsi:type="ns1:AdGroupOperation">
<operator xsi:type="ns1:Operator">ADD</operator>
<operand xsi:type="ns1:AdGroup">...
</operations>
<operations>
...
</ope

Content length of 262144, where the "e" in the last line is the 262144th byte.

Page 42: BatchJobService

Request 3

Content-Range: 524288-524304/524305

rations></mutate>

● Content length without padding of 17 bytes, where the closing > on </mutate> is the 17th byte

● Total content length across all requests for the job is 262144+262144+17 = 524305 bytes

Page 43: BatchJobService

Use the Client Libraries!

● The client libraries have utilities to do all the parsing

● No need to worry about size details

● Check out the online examples
○ https://goo.gl/wgywm1

Page 44: BatchJobService

Best Practices: General Guidelines

Page 45: BatchJobService

Improve Throughput

● Prefer fewer, larger jobs over many smaller jobs

● Exponential back off when polling

● Don't poll job status too frequently
○ Might hit a rate limit

Page 46: BatchJobService

Dealing with Same Client ID

● Avoid different jobs working on the same objects
○ Might result in deadlocks, followed by execution failures

● Wait for the job to be DONE or CANCELED

Page 47: BatchJobService

One Last Tip...

● Avoid multiple mutates of the same object in the same job
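One way to catch this before upload is to scan the IDs that the operations in a job target. A minimal sketch, assuming you can extract the target object ID of each operation into a list (the class and method names are hypothetical):

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch: find objects that are mutated more than once within one job. */
final class DuplicateTargetSketch {

    /** Returns the IDs that occur more than once, in order of first repetition. */
    static Set<Long> repeatedTargets(List<Long> targetIds) {
        Set<Long> seen = new HashSet<>();
        Set<Long> repeated = new LinkedHashSet<>();
        for (Long id : targetIds) {
            if (!seen.add(id)) { // add returns false if the ID was already seen
                repeated.add(id);
            }
        }
        return repeated;
    }
}
```

A non-empty result suggests merging those operations or moving them to a later job.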

Page 48: BatchJobService

Limitations

Page 49: BatchJobService

Regarding Operations Size

● 1 GB of unfinished operations per account at any given time
○ Exceeding it results in a DISK_QUOTA_EXCEEDED error
○ Operations from incremental uploads where the last upload has not occurred do not count towards this limit

● Just wait for some jobs to complete and try again

Page 50: BatchJobService

Regarding Shopping Campaigns

● BatchJobService (BJS) behaves like a request with partialFailure = true

● Partial failure is not supported for Shopping campaign ad group criteria...

● ...so BJS does not support AdGroupCriterionOperations on ad groups of Shopping campaigns
○ Will result in a CAMPAIGN_TYPE_NOT_COMPATIBLE_WITH_PARTIAL_FAILURE error

Page 51: BatchJobService

Cancelling the Job

BatchJob status is read-only:

● You can't cancel a job before the operations upload is finished

● You can't cancel a job while it's executing