A technical question - how does your system work in the IRIS or other examples you have? It looks like it hides under the hood a few things compared to this implementation - type of the model (tensorflow) and the logic that fetches it from S3, right? Let's imagine we would want to make a DVC repo (just to store model versions to start) out of one of your examples instead of the DVC get started, how would we do that with the current implementation (through metadata + custom init)?
>> how does your system work in the IRIS or other examples you have? It looks like it hides under the hood a few things compared to this implementation - type of the model (tensorflow) and the logic that fetches it from S3, right?
That's exactly right. Cortex has three runtimes:
The Predictor runtime, which is used in this post, can run arbitrary Python. There is an optional key in `cortex.yaml` for Predictors called `model`, which is an S3 path to an exported model (or directory). If provided, Cortex will download the file/directory at that path and make it available as an argument in the `init(model_path, metadata)` function in your Predictor implementation (see here for the Predictor docs: https://www.cortex.dev/deployments/predictor)
The TensorFlow and ONNX runtimes behave a little differently (and similar to each other): `model` is a required field in the API config, and Cortex handles downloading the model and running inference against it. You may define a `request_handler`, which can contain pre- and post-request handling (here are the TensorFlow docs: https://www.cortex.dev/deployments/tensorflow, and here are the ONNX docs: https://www.cortex.dev/deployments/onnx)
>> Let's imagine we would want to make a DVC repo (just to store model versions to start) out of one of your examples instead of the DVC get started, how would we do that with the current implementation (through metadata + custom init)?
Yes, that is also exactly right - you'll have to use the Predictor runtime since that allows you to define how to download your model. You would specify metadata and leave out the `model` config field, similar to as done in this post. In `init(model_path, metadata)`, you would use the metadata to download and load the model
I really believe that "end-to-end AI platforms" of the future should be built on top of open-source tools like these two. I cannot wait to see when other parts of AI platforms will be integrated with the existing ones: model performance monitoring, data catalogs and etc..