Large Models
As large model technology iterates, more and more large models are being applied across different business scenarios. Rill Flow is designed to run long-running tasks, which fits the characteristics of large model calls well.
Rill Flow can integrate cloud-hosted large models such as ChatGPT, as well as privately deployed large models.
Expose models via an HTTP service
Large models usually expose only C++ or Python interfaces, and calling them directly across languages makes business adoption awkward.
We therefore recommend wrapping large model interfaces behind the HTTP protocol. FastAPI is a common HTTP framework for exposing large model interfaces.
Deploy each model independently
Each large model usually requires a specific software and hardware environment, and models and their fine-tuned versions iterate quickly because the field is developing rapidly. Deploying multiple large models in the same runtime environment significantly increases the complexity of operating and managing them.
We therefore recommend deploying each large model in its own isolated environment, based on technologies such as Docker and K8S.
At the same time, deploying each large model independently requires stronger deployment capabilities.
Distributed Storage
Image and video large models may need to pull or produce a large number of files.
Rill Flow's context mechanism does not support file storage. If files need to be shared between tasks, introduce a distributed storage service and pass the storage addresses between task nodes through the context mechanism.
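The pattern described above can be sketched as follows. This is not Rill Flow's API: a plain `dict` stands in for the context, a temporary local directory stands in for the distributed storage service, and the task functions and the `request_id`, `image_path`, and `image_size` keys are hypothetical. The point is only that the file itself stays in storage while its address travels between tasks.

```python
import tempfile
from pathlib import Path

# Stand-in for a distributed storage mount or bucket (assumption for this sketch).
STORAGE_ROOT = Path(tempfile.mkdtemp(prefix="shared-storage-"))


def generate_image_task(context: dict) -> dict:
    """Producer task: write the output file to shared storage.

    Only the storage address is added to the context, never the file bytes.
    """
    output_path = STORAGE_ROOT / f"{context['request_id']}.png"
    output_path.write_bytes(b"fake image bytes")  # placeholder for model output
    return {**context, "image_path": str(output_path)}


def inspect_image_task(context: dict) -> dict:
    """Consumer task: resolve the address from the context and read the file."""
    data = Path(context["image_path"]).read_bytes()
    return {**context, "image_size": len(data)}


# The context passed between the two tasks carries only small, serializable values.
context = generate_image_task({"request_id": "demo-001"})
context = inspect_image_task(context)
```

In a real deployment the address would more likely be an object storage URL or a path on a shared file system that every task node can reach.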
Serverless
Large models are characterized by high deployment costs and low request volumes, so pairing large model services with a serverless platform can improve the utilization of GPU resources.