3 changes: 3 additions & 0 deletions .gitignore
@@ -127,3 +127,6 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
+
+# Misc / user
+.history
10 changes: 4 additions & 6 deletions Dockerfile
@@ -1,3 +1,5 @@
+# This is a potassium-standard dockerfile, compatible with Banana
+
 # Must use a Cuda version 11+
 FROM pytorch/pytorch:1.11.0-cuda11.3-cudnn8-runtime
 
@@ -11,18 +13,14 @@ RUN pip3 install --upgrade pip
 ADD requirements.txt requirements.txt
 RUN pip3 install -r requirements.txt
 
-# We add the banana boilerplate here
-ADD server.py .
-
 # Add your model weight files
 # (in this case we have a python script)
 ADD download.py .
 RUN python3 download.py
 
-# Add your custom app code, init() and inference()
-ADD app.py .
+ADD . .
 
 EXPOSE 8000
 
-CMD python3 -u server.py
+CMD python3 -u app.py
34 changes: 14 additions & 20 deletions README.md
@@ -1,20 +1,14 @@
-
-# 🍌 Banana Serverless
-
-This repo gives a framework to serve ML models in production using simple HTTP servers.
-
-# Quickstart
-**[Follow the quickstart guide in Banana's documentation to use this repo](https://docs.banana.dev/banana-docs/quickstart).**
-
-*(choose "GitHub Repository" deployment method)*
-
-<br>
-
-# Helpful Links
-Understand the 🍌 [Serverless framework](https://docs.banana.dev/banana-docs/core-concepts/inference-server/serverless-framework) and functionality of each file within it.
-
-Generalize this framework to [deploy anything on Banana](https://docs.banana.dev/banana-docs/resources/how-to-serve-anything-on-banana).
-
-<br>
-
-## Use Banana for scale.
+# My Potassium App
+This is a Potassium HTTP server, created with the `banana init` CLI
+
+### Testing
+Start a local dev server with `banana dev`
+
+### Deployment
+1. Create empty repo on [Github](https://github.com)
+2. Push this repo to github
+```
+git remote add origin https://github.com/{username}/{repo-name}.git
+```
+3. [Log into Banana](https://app.banana.dev/onboard)
+4. Select this repo to build and deploy!
55 changes: 33 additions & 22 deletions app.py
@@ -1,26 +1,37 @@
-from transformers import pipeline
-import torch
+from potassium import Potassium, Request, Response
+
+from sentence_transformers import SentenceTransformer
+from sklearn.preprocessing import normalize
 
-# Init is ran on server startup
-# Load your model to GPU as a global variable here using the variable name "model"
+app = Potassium("my_app")
+
+# @app.init runs at startup, and loads models into the app's context
+@app.init
 def init():
-    global model
+    model = SentenceTransformer("sentence-transformers/paraphrase-mpnet-base-v2")
 
-    device = 0 if torch.cuda.is_available() else -1
-    model = pipeline('fill-mask', model='bert-base-uncased', device=device)
-
-# Inference is ran for every server call
-# Reference your preloaded global model variable here.
-def inference(model_inputs:dict) -> dict:
-    global model
-
-    # Parse out your arguments
-    prompt = model_inputs.get('prompt', None)
-    if prompt == None:
-        return {'message': "No prompt provided"}
-
-    # Run the model
-    result = model(prompt)
+    context = {
+        "model": model
+    }
+
+    return context
+
+# @app.handler runs for every call
+@app.handler()
+def handler(context: dict, request: Request) -> Response:
+    prompt = request.json.get("prompt")
+    model = context.get("model")
+    sentence_embeddings = model.encode(prompt)
+    normalized_embeddings = normalize(sentence_embeddings)
+
+    # Convert the output array to a list
+    output = normalized_embeddings.tolist()
+
+    return Response(
+        json = {"data": output},
+        status=200
+    )
 
-    # Return the results as a dictionary
-    return result
+if __name__ == "__main__":
+    app.serve()
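For reference, `normalize` from `sklearn.preprocessing` performs row-wise L2 normalization and expects a 2D array; since `model.encode(prompt)` on a single string yields a 1D vector, the handler may need to pass `[prompt]` or reshape before normalizing. A minimal pure-Python sketch of the same operation (the `l2_normalize` helper is hypothetical, not part of this PR):

```python
import math

def l2_normalize(rows):
    # Row-wise L2 normalization, mirroring sklearn.preprocessing.normalize(X, norm="l2")
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        # Leave all-zero rows unchanged, as sklearn does
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out

print(l2_normalize([[3.0, 4.0]]))  # → [[0.6, 0.8]]
```

After this step each embedding has unit length, so dot products between them are cosine similarities.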
24 changes: 24 additions & 0 deletions banana_config.json
@@ -0,0 +1,24 @@
+{
+    "name": "",
+    "category": "",
+    "example_input": {
+        "prompt": "Hello I am a [MASK] model."
+    },
+    "example_output": {
+        "outputs": [
+            {
+                "score": 0.13177461922168732,
+                "token": 4827,
+                "token_str": "fashion",
+                "sequence": "hello i am a fashion model."
+            },
+            {
+                "score": 0.1120428815484047,
+                "token": 2535,
+                "token_str": "role",
+                "sequence": "hello i am a role model."
+            }
+        ]
+    },
+    "version": "1"
+}
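Worth noting: `example_output` above still shows fill-mask predictions from the old BERT model, while the new handler returns normalized sentence embeddings under a `data` key. If the example were updated to match, it might look like the following (illustrative values only, truncated):

```json
{
    "data": [[0.0132, -0.0421, 0.0277]]
}
```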
4 changes: 2 additions & 2 deletions download.py
@@ -3,11 +3,11 @@
 
 # In this example: A Huggingface BERT model
 
-from transformers import pipeline
+from sentence_transformers import SentenceTransformer
 
 def download_model():
     # do a dry run of loading the huggingface model, which will download weights
-    pipeline('fill-mask', model='bert-base-uncased')
+    SentenceTransformer('sentence-transformers/paraphrase-mpnet-base-v2')
 
 if __name__ == "__main__":
     download_model()
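The `download.py` pattern above, instantiating the model once at image build time (`RUN python3 download.py` in the Dockerfile) so the weights land in a Docker layer and cold starts skip the download, generalizes beyond Hugging Face. A sketch with a hypothetical `ensure_weights` helper standing in for the model constructor:

```python
import os
import tempfile

def ensure_weights(cache_dir: str, name: str, download) -> str:
    # Download weights into cache_dir only if not already present.
    # At image build time this runs once (paying the download cost);
    # at container start the cached file is found and reused.
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(download())
    return path

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache:
        ensure_weights(cache, "model.bin", lambda: b"fake weights")
        # Second call is a no-op: the file already exists
        ensure_weights(cache, "model.bin", lambda: b"fake weights")
```

The same idea is what makes the `SentenceTransformer(...)` dry run above effective: the constructor caches weights to disk as a side effect, and the build layer preserves that cache.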
4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,3 +1,3 @@
-sanic==22.6.2
-transformers
+potassium
+sentence-transformers==2.2.2
 accelerate
42 changes: 0 additions & 42 deletions server.py
This file was deleted.
10 changes: 0 additions & 10 deletions test.py
This file was deleted.