GPT-2 Text Generation Deployment Strategy on AWS

Manav tidhan
4 min read · May 17, 2021

Architecture:

The GPT-2 model is deployed on an Amazon SageMaker instance. A Lambda function invokes the SageMaker endpoint, and API Gateway exposes that function so that we can call it through a REST API.

Design

Step 1: Subscribe to GPT-2 in AWS Marketplace

This step is quite easy: search for GPT2 in AWS Marketplace and accept the subscription.

Once you have subscribed to GPT-2, follow the next steps.

  1. Go to the SageMaker dashboard and create a notebook instance.
  2. Open Jupyter on the instance, go to "SageMaker Examples", and use the notebook file highlighted in the picture below.

Review the commands in the notebook, then run them one by one. Use conda_python3 as the kernel.

Run each step up to the one shown below. Once the model is deployed to an endpoint, we can start using it.

Note that the endpoint name here is:

endpoint_name = "demo-gpt2-endpoint"
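Before moving on, it can help to confirm that the endpoint has finished deploying. The following is a minimal sketch, not part of the sample notebook: the helper name endpoint_is_ready is hypothetical, and it assumes you pass in a boto3 SageMaker client created with boto3.client("sagemaker").

```python
def endpoint_is_ready(sagemaker_client, endpoint_name="demo-gpt2-endpoint"):
    """Return True once the endpoint reports the InService status.

    sagemaker_client is assumed to be a boto3 "sagemaker" client; any object
    with a compatible describe_endpoint method works as well.
    """
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"] == "InService"
```

In the notebook you would call it as endpoint_is_ready(boto3.client("sagemaker")) and continue to the next step only when it returns True.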

Once you finish testing, don't forget to run the delete command in the notebook. This command deletes the model, and it is also mentioned in the last step.

Step 2: Use a Lambda function to invoke the endpoint

  1. On the Lambda console, on the Functions page, choose Create function.
  2. For Function name, enter a name.
  3. For Runtime, choose your runtime.
  4. For Execution role, select Use an existing role.

Here you need to select an existing role, or create a new one, that has the AmazonSageMakerFullAccess policy attached and that lists Lambda in its trust relationship, so that Lambda can access SageMaker.
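As a rough illustration of what "include Lambda in the trust relationship" means, the sketch below builds the trust policy document as a Python dictionary and prints it as JSON, ready to paste into the role's trust relationship editor. The second dictionary is an assumption on my side rather than something used in this article: a narrower permissions policy that allows only sagemaker:InvokeEndpoint, which you could attach instead of AmazonSageMakerFullAccess.

```python
import json

# Trust policy: allows the Lambda service to assume the execution role.
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Optional, narrower alternative to AmazonSageMakerFullAccess (an assumption,
# not the policy used in this article): permit only endpoint invocation.
INVOKE_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "*",
        }
    ],
}


def policy_json(policy):
    """Serialize a policy dictionary to indented JSON text."""
    return json.dumps(policy, indent=2)


print(policy_json(LAMBDA_TRUST_POLICY))
```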

Code to deploy for the Lambda function:

import json
import os

import boto3

# grab environment variables
ENDPOINT_NAME = os.environ["ENDPOINT_NAME"]

client = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # Forward the incoming event to the endpoint as a JSON payload.
    payload = json.dumps(event)
    print(payload)

    # Sample payload, useful for testing without an incoming request:
    # payload = '{"input": "Machine learning is great for humanity. It helps", "length": 100, "repetition_penalty": 10, "num_return_sequences": 1}'

    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=payload,
    )

    # Decode the response bytes so that Lambda can serialize the return value.
    return response["Body"].read().decode("utf-8")
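If you want the function to tolerate requests that omit some fields, one option is to fill in defaults before calling the endpoint. This is only a sketch: build_payload is a hypothetical helper, and the default values are simply the ones from the sample payload in this article.

```python
import json

# Defaults taken from the sample payload in this article.
DEFAULTS = {"length": 100, "repetition_penalty": 10, "num_return_sequences": 1}


def build_payload(event):
    """Return the JSON payload for the endpoint, filling in missing fields."""
    if "input" not in event:
        raise ValueError('event must contain an "input" field')
    body = {"input": event["input"]}
    for key, default in DEFAULTS.items():
        # Use the caller's value when present, otherwise the default.
        body[key] = event.get(key, default)
    return json.dumps(body)
```

Inside lambda_handler you could then use payload = build_payload(event) in place of json.dumps(event).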

By default, the Lambda function timeout is 3 seconds. You may need to increase it if you are using a small instance. In my case I used ml.t2.medium, which responds in around 27 seconds; you can adjust the instance size to suit your needs.

ENDPOINT_NAME is an environment variable that holds the name of the SageMaker model endpoint you just deployed using the sample notebook. Set its value to "demo-gpt2-endpoint", or to the endpoint name you created if it is different.
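Both settings can be changed in the Lambda console, but they can also be applied from code. The sketch below assumes a boto3 Lambda client created with boto3.client("lambda"); the helper name configure_function and the 60-second default are my own choices for illustration, not values from the sample notebook.

```python
def configure_function(lambda_client, function_name,
                       endpoint_name="demo-gpt2-endpoint", timeout_seconds=60):
    """Set the function timeout and the ENDPOINT_NAME environment variable.

    lambda_client is assumed to be a boto3 "lambda" client.
    """
    # Lambda accepts timeouts from 1 second up to 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout_seconds must be between 1 and 900")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
        # This sets the function's whole variable map, so include any other
        # variables the function already relies on.
        Environment={"Variables": {"ENDPOINT_NAME": endpoint_name}},
    )
```

With a response time of around 27 seconds on ml.t2.medium, a timeout of 60 seconds leaves some headroom.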

Step 3: Create a REST API

You can create an API by following these steps:

On the API Gateway console, choose REST API.

Choose Build.

Select New API.

For API name, enter a name (for example, GPT2_Text_Generator).

Leave Endpoint Type as Regional.

Choose Create API.

On the Actions menu, choose Create resource.

Enter a name for the resource (for example, gettext).

After the resource is created, on the Actions menu, choose Create Method to create a POST method.

For Integration type, select Lambda Function.

For Lambda function, enter the function you created.

When the setup is complete, you can deploy the API to a stage.

On the Actions menu, choose Deploy API.

Create a new stage called stage.

Choose Deploy.

This step gives you the invoke URL.
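For a Regional REST API, the invoke URL follows a fixed pattern built from the API ID, the Region, the stage name and the resource path. The small sketch below shows how the pieces fit together; the API ID and Region in the usage line are placeholders, not values from this deployment.

```python
def invoke_url(api_id, region, stage="stage", resource="gettext"):
    """Build the invoke URL of a Regional API Gateway REST API resource."""
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/{resource}"


# Placeholder API ID and Region, for illustration only.
print(invoke_url("abc123", "us-east-1"))
```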

Step 5: Test

Run the command below from your machine:

curl --location --request POST 'https://example.amazonaws.com/stage/gettext' \
--header 'Content-Type: application/json' \
--data-raw '{
"input": "Machine learning is great for humanity. It helps",
"length": 50,
"repetition_penalty": 10,
"num_return_sequences": 1
}'

Below is the sample output when querying the API with Postman.
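If you prefer Python to curl or Postman, the same request can be prepared with the standard library. This is a sketch under the same assumptions as the curl command: the URL is the placeholder used above, and build_request is a hypothetical helper that only constructs the request without sending it.

```python
import json
import urllib.request


def build_request(url, prompt, length=50, repetition_penalty=10,
                  num_return_sequences=1):
    """Prepare a POST request carrying the same JSON body as the curl command."""
    body = json.dumps({
        "input": prompt,
        "length": length,
        "repetition_penalty": repetition_penalty,
        "num_return_sequences": num_return_sequences,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "https://example.amazonaws.com/stage/gettext",
    "Machine learning is great for humanity. It helps",
)
```

To actually send it, pass the request to urllib.request.urlopen with a timeout long enough for the model to respond, for example urllib.request.urlopen(request, timeout=60).read().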

Step 6: Clean-Up

Don't forget to delete the model and endpoint once you have finished testing. Run the delete command in the SageMaker notebook.

Then verify the deletion under Inference > Models and Inference > Endpoints in the SageMaker dashboard. If anything is still there, delete it from the dashboard as well, and stop or delete your notebook instance. Afterwards you can unsubscribe from GPT-2 in AWS Marketplace.
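The notebook's own cleanup command is not reproduced here, so the following is only a sketch of an equivalent cleanup using boto3. It assumes a client created with boto3.client("sagemaker"); the helper name is hypothetical, and the endpoint configuration and model names are whatever your deployment created, which you can look up in the SageMaker dashboard.

```python
def delete_inference_resources(sagemaker_client, endpoint_name,
                               endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model.

    sagemaker_client is assumed to be a boto3 "sagemaker" client.
    """
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sagemaker_client.delete_model(ModelName=model_name)
```

An endpoint is billed for as long as it exists, so it is worth deleting it first and checking the dashboard afterwards.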
