In this talk, I will present a FLAME Slurm backend used to run LangChain tool calling with Bumblebee. Slurm is a cluster management system used in high-performance computing, for example at universities, that lets users request compute resources for scientific computation or for AI training and inference. I will start with a brief introduction to what Slurm is and how to use it. Then, in an interactive Livebook session, I will show on-demand allocation of GPU resources on a university Slurm cluster with FLAME. With the GPU running, I will demonstrate tool calling with Llama 3.1 using Elixir LangChain and Bumblebee.
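To give a flavour of what the Livebook session looks like, here is a minimal sketch of the FLAME side: a pool that scales from zero and a remote call in which the Bumblebee inference would run. The pool name, options, and the use of FLAME.LocalBackend as a stand-in are assumptions for illustration; the actual Slurm backend module and its configuration are covered in the talk.

```elixir
# Minimal sketch (assumptions noted above): start a FLAME pool that can
# scale from zero, then run work on the runner it allocates.
Mix.install([:flame])

{:ok, _sup} =
  Supervisor.start_link(
    [
      {FLAME.Pool,
       name: :gpu_pool,
       min: 0,
       max: 1,
       max_concurrency: 1,
       idle_shutdown_after: :timer.minutes(10),
       # FLAME.LocalBackend is a placeholder here; on the cluster this is
       # swapped for the Slurm backend presented in the talk.
       backend: FLAME.LocalBackend}
    ],
    strategy: :one_for_one
  )

# The anonymous function runs on the FLAME runner; this is where the
# Bumblebee/Nx serving for Llama 3.1 and the LangChain tool calls would live.
FLAME.call(:gpu_pool, fn -> Node.self() end)
```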
Key Takeaways:
- How to use FLAME on an HPC cluster