braindump ... cause thread-dumps are not enough ;)

kubeflow evaluation notes

  • platform for machine learning that's supposed to streamline the whole process: data prep, research, coding a model and then hosting it as an API
  • youtube playlist with Kubeflow 101
  • it focuses mostly on standardizing the whole flow of work and joining together different tools, e.g. JupyterLab and KFServing
  • kubeflow works by “extending” kubernetes into a stack/environment for ML
    • it enhances it with CRDs which fall into the ML domain (like jupyter notebooks, or ML processing pipelines; sketch at the end of this list)
    • Operators make sure Kubernetes executes what is defined in CRDs
    • It uses Kubernetes more like a framework to implement something
    • The issue seems to be that the “abstraction” level isn’t fully adjusted to the use-cases (e.g. for scheduling I find Airflow a more mature/feature-rich tool)
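    • for illustration, here's roughly what such a custom resource looks like, written as a Python dict (a notebook server as a Notebook resource; the group/version and field names are my assumptions, check the notebook-controller docs for the real schema):

        # a notebook server described declaratively; the notebook Operator
        # watches these objects and creates the actual pods
        notebook = {
            "apiVersion": "kubeflow.org/v1",   # assumed group/version
            "kind": "Notebook",
            "metadata": {"name": "demo-notebook", "namespace": "my-team"},
            "spec": {
                "template": {
                    "spec": {
                        "containers": [{
                            "name": "demo-notebook",
                            "image": "jupyter/scipy-notebook:latest",
                        }],
                    },
                },
            },
        }
        # you'd create it with kubectl apply or the CustomObjectsApi
        # (see the KFServing sketch further down)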
  • runs everywhere kubernetes runs
  • there is a nice web app, the dashboard, which lets you manage the resources
  • provides managed jupyter notebooks (you can start them from the central dashboard)
    • they run as pods inside the Kubernetes cluster
  • resources can be split between teams in form of namespaces
  • there is a pipeline component, where you plug in different steps of the model processing pipeline (data collection, training, serving etc) as Kubernetes resources (I guess CRDs)
  • you can easily serve your model by using one of the available serving platforms (KFServing or Seldon Core)
  • easiest way to do a test install is through microk8s:
      $ sudo snap install microk8s --classic
      $ microk8s status --wait-ready
      $ microk8s enable kubeflow
    
  • provides an abstraction for “pipelines”, which are essentially DAGs of pods (see the sketch at the end of this list):
    • you specify the docker image
    • you specify the input (parameters, data etc)
    • you specify what is the output (file, model etc)
      • output of one of the steps can become input of others
    • you define how the steps are executed in Python, in a notebook
    • you trigger a run from the notebook (this also makes it visible in the kubeflow dashboard)
    • once deployed to k8s, pipelines are visible in the dashboard’s Pipelines UI
    • video which shows an example of a pipeline
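    • a minimal sketch of such a pipeline in the kfp Python SDK (v1 DSL); the images and paths here are made up for illustration:

        import kfp
        from kfp import dsl

        @dsl.pipeline(name="demo-pipeline")
        def demo_pipeline(data_path: str = "/data/input.csv"):
            prep = dsl.ContainerOp(
                name="prepare-data",
                image="example.registry/prep:latest",         # hypothetical image
                arguments=["--input", data_path],
                file_outputs={"clean": "/out/clean.csv"},     # declared output file
            )
            dsl.ContainerOp(
                name="train-model",
                image="example.registry/train:latest",        # hypothetical image
                arguments=["--data", prep.outputs["clean"]],  # step output -> step input
            )

        # compile to a deployable package, or trigger a run straight from the notebook
        kfp.compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
        # kfp.Client().create_run_from_pipeline_func(demo_pipeline, arguments={})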
  • has a built-in metadata store
    • can be interacted with through a Python SDK (rough sketch below)
    • link to docs
    • kubeflow tools write a bunch of info in there (e.g. pipelines record each run execution)
    • contains an artifact store which can be used e.g. to host models (“model store”)
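    • rough sketch of logging a model with that SDK (I'm going from memory here, so treat the class names and the gRPC endpoint as assumptions):

        from kubeflow.metadata import metadata

        store = metadata.Store(grpc_host="metadata-grpc-service.kubeflow",
                               grpc_port=8080)  # assumed in-cluster endpoint
        ws = metadata.Workspace(store=store, name="demo-workspace")
        execution = metadata.Execution(name="training-run-1", workspace=ws)

        # register the trained model as an artifact ("model store")
        execution.log_output(metadata.Model(
            name="demo-model",
            uri="s3://example-bucket/models/demo",  # hypothetical location
            version="v1",
        ))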
  • KFServing
    • takes a definition + an ML model and turns it into an API
    • supports GPUs, autoscaling, canary roll-outs
    • built on top of Istio and Knative
    • you specify the model (effectively the API config) as a k8s CRD (sketch below)
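    • a sketch of creating one through the generic Kubernetes Python client; the group/version and storage path are my assumptions based on the v1beta1 InferenceService schema:

        from kubernetes import client, config

        config.load_kube_config()

        # an sklearn model served straight from object storage
        inference_service = {
            "apiVersion": "serving.kubeflow.org/v1beta1",
            "kind": "InferenceService",
            "metadata": {"name": "sklearn-demo", "namespace": "my-team"},
            "spec": {
                "predictor": {
                    "sklearn": {"storageUri": "s3://example-bucket/models/demo"},
                },
            },
        }

        client.CustomObjectsApi().create_namespaced_custom_object(
            group="serving.kubeflow.org", version="v1beta1",
            namespace="my-team", plural="inferenceservices",
            body=inference_service,
        )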
  • kubeflow pipelines disadvantages:
    • you need to create a docker image for every step (or have a template one)
    • the pipeline needs to be compiled from Python code