An Open Source Stack for Managing and Deploying Models

shcheklein · on Nov 22, 2019

A technical question - how does your system work in the IRIS or other examples you have? It looks like it hides under the hood a few things compared to this implementation - type of the model (tensorflow) and the logic that fetches it from S3, right? Let's imagine we would want to make a DVC repo (just to store model versions to start) out of one of your examples instead of the DVC get started, how would we do that with the current implementation (through metadata + custom init)?

deliahu · on Nov 22, 2019

Hi, I'm one of the maintainers of Cortex.

>> how does your system work in the IRIS or other examples you have? It looks like it hides under the hood a few things compared to this implementation - type of the model (tensorflow) and the logic that fetches it from S3, right?

That's exactly right. Cortex has three runtimes:

The Predictor runtime, which is used in this post, can run arbitrary Python. There is an optional key in `cortex.yaml` for Predictors called `model`, which is an S3 path to an exported model (file or directory). If provided, Cortex will download the file/directory at that path and make it available as the `model_path` argument of the `init(model_path, metadata)` function in your Predictor implementation (see here for the Predictor docs: https://www.cortex.dev/deployments/predictor).

The TensorFlow and ONNX runtimes behave a little differently (and similarly to each other): `model` is a required field in the API config, and Cortex handles downloading the model and running inference against it. You may define a `request_handler`, which can contain pre- and post-request handling (here are the TensorFlow docs: https://www.cortex.dev/deployments/tensorflow, and here are the ONNX docs: https://www.cortex.dev/deployments/onnx).
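To make the pre- and post-request handling concrete, here is a rough sketch of what a request handler file might look like. The hook names `pre_inference` and `post_inference`, their signatures, the feature names, and the shape of `prediction` are all assumptions for illustration; check the linked docs for the real interface.

```python
# Hypothetical request handler sketch; hook names and payload shapes are assumptions.

# Assumed feature order expected by the exported model (illustrative, iris-style).
FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# Assumed class labels, in the order of the model's output scores.
LABELS = ["setosa", "versicolor", "virginica"]


def pre_inference(sample, metadata):
    """Turn the incoming JSON sample into the input the model expects."""
    return {"input": [float(sample[name]) for name in FEATURES]}


def post_inference(prediction, metadata):
    """Map the raw scores to a human-readable label."""
    scores = prediction["scores"]  # assumed key holding one score per class
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": LABELS[best], "score": scores[best]}
```

In this sketch the runtime would still own model download and inference; the handler only reshapes what goes in and what comes out.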

>> Let's imagine we would want to make a DVC repo (just to store model versions to start) out of one of your examples instead of the DVC get started, how would we do that with the current implementation (through metadata + custom init)?

Yes, that is also exactly right - you'll have to use the Predictor runtime, since that allows you to define how to download your model. You would specify metadata and leave out the `model` config field, as is done in this post. In `init(model_path, metadata)`, you would use the metadata to download and load the model.
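As a minimal sketch of that approach: the `model_url` metadata key, the `load_model` helper, and the pickle format are all hypothetical choices for illustration. Only the `init(model_path, metadata)` signature comes from the description above. For a DVC-tracked model, the URL would point at wherever that version of the file is stored.

```python
import pickle
import tempfile
import urllib.request
from pathlib import Path

# Module-level holder so later prediction code can reach the loaded model.
model = None


def load_model(metadata):
    """Download the file named by metadata["model_url"] (hypothetical key) and unpickle it."""
    dest = Path(tempfile.mkdtemp()) / "model.pkl"
    # urlretrieve handles http(s):// and file:// URLs.
    urllib.request.urlretrieve(metadata["model_url"], dest)
    with open(dest, "rb") as f:
        # Only unpickle files from a source you trust.
        return pickle.load(f)


def init(model_path, metadata):
    """Called once at startup; model_path is unused because `model` is left out of the config."""
    global model
    model = load_model(metadata)
```

Keeping the download logic in its own function makes it easy to swap in a different fetch mechanism without touching `init`.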

shcheklein · on Nov 22, 2019

Thank you! It all makes sense.

dmpetrov · on Nov 22, 2019

I really believe that "end-to-end AI platforms" of the future should be built on top of open-source tools like these two. I can't wait to see other parts of AI platforms, such as model performance monitoring and data catalogs, integrated with the existing ones.

Disclaimer: I'm one of the creators of DVC.

shcheklein · on Nov 21, 2019

Nice! Thanks for creating the integration. I'm one of the maintainers of the DVC project and would be happy to answer any questions.

renuka · on Nov 22, 2019

Interesting integration. I'm wondering how Cortex is different from Seldon?

keydunov · on Nov 22, 2019

Can Cortex deploy to other cloud providers besides AWS?

calebkaiser · on Nov 22, 2019

Currently we only support AWS—though we're working on supporting GCP as soon as possible. What platform are you using now?

keydunov · on Nov 22, 2019

Thanks! We're using GCP.