GPT-2 Text Generation Deployment Strategy on AWS
Architecture:
The GPT-2 model is deployed on an Amazon SageMaker instance. A Lambda function invokes the SageMaker endpoint, and API Gateway exposes that function so the model can be called through a REST API.
Step 1: Subscribe to GPT2 in AWS Marketplace
This step is easy: search for GPT2 in AWS Marketplace and accept the subscription.
Once you have subscribed to GPT2, follow the next steps.
1. Go to the SageMaker dashboard and create a notebook instance.
2. Open Jupyter on the instance, go to “SageMaker Examples”, and use the notebook file highlighted in the picture below.
Use the conda_python3 kernel.
Review the commands in the notebook and run them one by one. Run each step up to the one that deploys the model to an endpoint; once the endpoint is deployed, we can start using it.
Please note that the endpoint name here is:
endpoint_name = "demo-gpt2-endpoint"
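Deployment can take several minutes, so it may help to confirm that the endpoint is ready before calling it. The following is a minimal sketch, not part of the sample notebook: the helper name `wait_until_in_service` and its polling defaults are assumptions, and it expects a boto3 SageMaker client to be passed in.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll describe_endpoint until the endpoint is InService or has failed.

    sm_client is expected to be boto3.client("sagemaker"); the helper name
    and the polling defaults are illustrative assumptions.
    """
    for _ in range(max_polls):
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not ready after {max_polls} polls")


# Usage in the notebook (requires boto3 and AWS credentials):
#   import boto3
#   wait_until_in_service(boto3.client("sagemaker"), "demo-gpt2-endpoint")
```

boto3 also ships a built-in waiter for this, `get_waiter("endpoint_in_service")`; the explicit loop is shown only to make the status values visible.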
Once you finish testing, don’t forget to run the clean-up command in the notebook, which deletes the model. It is also mentioned in the last step.
Step 2: Use a Lambda function to invoke the endpoint
- On the Lambda console, on the Functions page, choose Create function.
- For Function name, enter a name.
- For Runtime, choose your runtime.
- For Execution role, select Use an existing role.
Here you need to select or create a role that has the AmazonSageMakerFullAccess policy attached and that lists Lambda (lambda.amazonaws.com) in its trust relationship, so that Lambda can access SageMaker.
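If you prefer to create the role from code rather than the console, the sketch below shows one way to do it with boto3. The helper names and the role name are hypothetical; the trust policy simply allows the Lambda service to assume the role, and the managed policy ARN is the one for AmazonSageMakerFullAccess.

```python
import json

SAGEMAKER_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"


def lambda_trust_policy():
    """Trust relationship that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_role(iam_client, role_name):
    """Create the role and attach AmazonSageMakerFullAccess.

    iam_client is expected to be boto3.client("iam"); role_name is any
    name you choose (hypothetical in this sketch).
    """
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=SAGEMAKER_FULL_ACCESS_ARN,
    )


# Usage (requires boto3 and IAM permissions):
#   import boto3
#   create_lambda_role(boto3.client("iam"), "gpt2-lambda-role")
```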
Code to deploy in the Lambda function:
import json
import os

import boto3

# Grab environment variables
ENDPOINT_NAME = os.environ["ENDPOINT_NAME"]
client = boto3.client("sagemaker-runtime")


def lambda_handler(event, context):
    # Assumes the event is the JSON request body (non-proxy Lambda integration).
    # print("Received event: " + json.dumps(event, indent=2))
    payload = json.dumps(event)
    print(payload)
    # Example payload for testing without an incoming request:
    # payload = '{"input": "Machine learning is great for humanity. It helps", "length": 100, "repetition_penalty": 10, "num_return_sequences": 1}'
    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=payload,
    )
    # Return the raw response body from the endpoint
    return response["Body"].read()
By default, the Lambda function timeout is 3 seconds; you may need to increase it if you are using a small instance. In my case I used ml.t2.medium, which responds in around 27 seconds. You can adjust the instance size to your needs.
ENDPOINT_NAME is an environment variable that holds the name of the SageMaker model endpoint you just deployed using the sample notebook. Set its value to “demo-gpt2-endpoint”, or to the endpoint name you created if it is different.
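Both settings can be changed on the function's Configuration tab in the console. As an alternative, here is a hedged sketch that builds the arguments for boto3's `update_function_configuration`; the helper name, the function name in the usage comment, and the 60-second default are assumptions chosen for illustration.

```python
def lambda_config_kwargs(function_name, endpoint_name, timeout_seconds=60):
    """Build arguments for update_function_configuration.

    Note that the Environment setting replaces all existing variables on
    the function, so include any others you still need.
    """
    # Lambda accepts timeouts from 1 to 900 seconds.
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("Lambda timeout must be between 1 and 900 seconds")
    return {
        "FunctionName": function_name,
        "Timeout": timeout_seconds,
        "Environment": {"Variables": {"ENDPOINT_NAME": endpoint_name}},
    }


# Usage (requires boto3 and AWS credentials; the function name is hypothetical):
#   import boto3
#   kwargs = lambda_config_kwargs("gpt2-invoker", "demo-gpt2-endpoint")
#   boto3.client("lambda").update_function_configuration(**kwargs)
```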
Step 3: Create a REST API
You can create an API by following these steps:
On the API Gateway console, choose REST API.
Choose Build.
Select New API.
For API name, enter a name (for example, GPT2_Text_Generator).
Leave Endpoint Type as Regional.
Choose Create API.
On the Actions menu, choose Create resource.
Enter a name for the resource (for example, gettext).
After the resource is created, on the Actions menu, choose Create Method to create a POST method.
For Integration type, select Lambda Function.
For Lambda function, enter the function you created.
When the setup is complete, you can deploy the API to a stage.
On the Actions menu, choose Deploy API.
Create a new stage called stage.
Choose Deploy.
This step gives you the invoke URL.
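For a Regional REST API, the invoke URL follows a fixed pattern built from the API ID, the Region, the stage name, and the resource path. The small sketch below assembles it; the helper name is an assumption, and the API ID and Region in the usage example are placeholders, not values from this walkthrough.

```python
def invoke_url(api_id, region, stage, resource):
    """Assemble the default invoke URL of a Regional REST API resource."""
    path = resource.lstrip("/")
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/{path}"


# Placeholder API ID and Region; "stage" and "gettext" match the steps above.
print(invoke_url("abc123", "us-east-1", "stage", "gettext"))
```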
Step 5: Test
Execute the command below from your machine:
curl --location --request POST 'https://example.amazonaws.com/stage/gettext' \
--header 'Content-Type: application/json' \
--data-raw '{
"input": "Machine learning is great for humanity. It helps",
"length": 50,
"repetition_penalty": 10,
"num_return_sequences": 1
}'
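The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl call; the function names and the 60-second client timeout are assumptions, and the URL must be replaced with your own invoke URL.

```python
import json
import urllib.request


def build_request(url, prompt, length=50, repetition_penalty=10, num_return_sequences=1):
    """Build a POST request with the same JSON body as the curl example."""
    body = {
        "input": prompt,
        "length": length,
        "repetition_penalty": repetition_penalty,
        "num_return_sequences": num_return_sequences,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(url, prompt, timeout=60, **params):
    """Send the request and return the response body as text."""
    request = build_request(url, prompt, **params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (replace the URL with your invoke URL):
#   print(generate("https://example.amazonaws.com/stage/gettext",
#                  "Machine learning is great for humanity. It helps"))
```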
Below is a sample output when querying with Postman:
Step 6: Clean-Up
Don’t forget to delete the model and endpoint once you have finished testing: run the clean-up command from the SageMaker notebook.
Then verify the deletion under Inference > Models and Inference > Endpoints in the SageMaker dashboard. If anything is still there, delete it from there as well, and stop or delete your notebook instance. Afterwards, you can unsubscribe from GPT2 in AWS Marketplace.
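If the notebook session is no longer available, the same clean-up can be done with boto3. This is a sketch and not the notebook's own command: the helper name is an assumption, and it looks up the endpoint configuration and model names from the endpoint before deleting all three.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and its models.

    sm_client is expected to be boto3.client("sagemaker"). Returns the
    names of the models that were deleted.
    """
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        sm_client.delete_model(ModelName=name)
    return model_names


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   delete_endpoint_resources(boto3.client("sagemaker"), "demo-gpt2-endpoint")
```

The notebook instance itself is not touched by this sketch and still needs to be stopped or deleted separately.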