BankNext CaseStudy: GenAi Multi-VectorDB Switch Automation
GenAi automation to launch & switch between multiple locally deployed VectorDBs.
Problem Statement :
BankNext’s AI scenarios demand the most performant tools. VectorDBs form the base of semantic retrieval, but incorrect DB selection by various teams results in erroneous benchmarking. Setting up and switching between multiple VectorDBs is tedious, which results in shaky confidence & suboptimal application behavior. BankNext needs an on-demand local VectorDB switching solution & it needs it fast.
Current State : Rigid & inflexible
1. Multiple teams are tasked to evaluate AI solutions for their use cases.
2. The VectorDB is selected based on what is available.
3. The DB that is easiest to set up is generally chosen, without due diligence.
4. This DB turns out to be poorly suited for the specific use case.
5. This defeats the whole objective of this intensive exercise.
6. The inability to switch to the desired DB on demand cripples innovation.
7. Results don’t add up. Business is agitated.
8. Houston, we have a problem!
Solution : Automate the GenAi platform to dynamically switch VectorDBs
1. GitHub link - https://github.com/vijayredkar/gen-ai-local-vectordb-automation.git
2. Scripting - .bat and bash
3. Local GIT setup
4. Local Maven setup
5. Java based tool for VectorDB interactions - LangChain4J
6. Embedding model - AllMiniLmL6V2EmbeddingModel
7. General purpose framework - SpringBoot/Java
8. Docker - my Docker runtime setup
9. Hardware - Local machine config higher than RAM: 16GB, Storage: 10GB, Cores: 4
System Design : Workflow
1- GIT clone from link to your local directory.
2- Run the automation script \gen-ai-local-vectordb-automation\setup.bat
3- Automation launches all the Vector DBs selected by you.
- ChromaDB
- ElasticSearch
- Redis
- Weaviate
- Qdrant
4- Builds & starts up the application.
5- Launches Swagger UI.
6- Application endpoints provide runtime DB switching mechanism.
7- The VectorDB to connect to can be specified in the qParam.
cd C:\temp
git clone https://github.com/vijayredkar/gen-ai-local-vectordb-automation.git
cd gen-ai-local-vectordb-automation
setup.bat
Execution Flow :
1. Application
- provides 3 endpoints.
- insert text into the chosen VectorDB.
- insert document into the chosen VectorDB.
- extract semantic results from the chosen VectorDB.
2. Insert document
- insert operations chunk the data per configured size.
- connects dynamically to the DB specified in the qParam.
- creates a new index per chunk in the DB at runtime.
- saves each data chunk into a distinct index.
- constructs DB index metadata & saves it under /knowledge-repo.
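The insert steps above can be sketched in plain Java. This is a minimal illustration, not the repository's code: the class DocumentChunker, the <docName>-chunk-<n> index naming scheme and the .idx metadata file format are all assumptions made for the example, and the actual write to the VectorDB (done through LangChain4J in the application) is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: class name, index naming and file layout are assumptions.
class DocumentChunker {

    // Splits the text into consecutive chunks of at most chunkSize characters.
    static List<String> chunk(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }

    // Derives one index name per chunk (assumed scheme: <docName>-chunk-<n>).
    static List<String> indexNames(String docName, int chunkCount) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            names.add(docName + "-chunk-" + i);
        }
        return names;
    }

    // Saves the index names, one per line, to <repoDir>/<docName>.idx.
    static Path saveMetadata(Path repoDir, String docName, List<String> names) throws IOException {
        Files.createDirectories(repoDir);
        return Files.write(repoDir.resolve(docName + ".idx"), names);
    }

    public static void main(String[] args) throws IOException {
        List<String> chunks = chunk("BankNext loan policy text to be indexed", 10);
        List<String> names = indexNames("loan-policy", chunks.size());
        // A temp directory stands in for the knowledge-repo folder.
        Path repo = Files.createTempDirectory("knowledge-repo");
        System.out.println(names + " -> " + saveMetadata(repo, "loan-policy", names));
    }
}
```

Keeping the index names in a small metadata file is what lets the extract step later find every index that belongs to a document without querying the DB for its index list.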
3. Extract semantic results
- the extract operation reads all related index names from \knowledge-repo.
- LangChain4J executes semantic search on all these indexes.
- text segments scoring higher than the min desired score are extracted.
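A sketch of the extract step, under stated assumptions: the per-index search is passed in as a function so the example stays independent of any one VectorDB, ScoredSegment is a hypothetical stand-in for the scored matches that LangChain4J returns, and the metadata file is assumed to hold one index name per line. Sorting the results best-first is an addition for readability, not something the steps above specify.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a scored search match.
final class ScoredSegment {
    final String text;
    final double score;

    ScoredSegment(String text, double score) {
        this.text = text;
        this.score = score;
    }
}

class SemanticExtractor {

    // Reads index names from a metadata file (assumed format: one name per line).
    static List<String> readIndexNames(Path metadataFile) throws IOException {
        return Files.readAllLines(metadataFile).stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    // Searches every index and keeps segments scoring above minScore, best first.
    static List<ScoredSegment> extract(List<String> indexNames,
                                       Function<String, List<ScoredSegment>> searchIndex,
                                       double minScore) {
        return indexNames.stream()
                .flatMap(name -> searchIndex.apply(name).stream())
                .filter(segment -> segment.score > minScore)
                .sorted(Comparator.comparingDouble((ScoredSegment s) -> s.score).reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Fake per-index results, for demonstration only.
        Map<String, List<ScoredSegment>> fake = Map.of(
                "doc-chunk-0", List.of(new ScoredSegment("loan tenure", 0.91)),
                "doc-chunk-1", List.of(new ScoredSegment("branch hours", 0.42)));
        for (ScoredSegment hit : extract(List.of("doc-chunk-0", "doc-chunk-1"), fake::get, 0.7)) {
            System.out.println(hit.score + " " + hit.text);
        }
    }
}
```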
Runtime VectorDB Switching Mechanism :
- Every endpoint has a query param “vectorDbToConnect”.
- It takes in the desired VectorDB name, e.g. qdrant.
- LangChain4J dynamically establishes a connection to the specified DB.
- On insert, DB indexes are created at runtime to save the text chunks.
- On fetch, all index names from \knowledge-repo for the doc are evaluated.
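One way to picture the switching mechanism is a lookup from the vectorDbToConnect value to a factory that builds the matching client. The sketch below uses only the JDK. VectorStoreClient and VectorDbSwitch are hypothetical names, the accepted values other than qdrant are assumed, and in the actual application each factory would build the corresponding LangChain4J embedding store rather than the placeholder used here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a per-database client.
interface VectorStoreClient {
    String describe();
}

// Maps the "vectorDbToConnect" query param value to a client factory.
class VectorDbSwitch {
    private final Map<String, Supplier<VectorStoreClient>> factories = new LinkedHashMap<>();

    VectorDbSwitch register(String name, Supplier<VectorStoreClient> factory) {
        factories.put(normalize(name), factory);
        return this;
    }

    // Builds a client for the requested DB, or fails with the supported names.
    VectorStoreClient connect(String vectorDbToConnect) {
        if (vectorDbToConnect == null) {
            throw new IllegalArgumentException("vectorDbToConnect is required");
        }
        Supplier<VectorStoreClient> factory = factories.get(normalize(vectorDbToConnect));
        if (factory == null) {
            throw new IllegalArgumentException("Unsupported VectorDB '" + vectorDbToConnect
                    + "', expected one of " + factories.keySet());
        }
        return factory.get();
    }

    // Registers placeholder clients; the names used as keys are assumptions.
    static VectorDbSwitch withDefaults() {
        VectorDbSwitch dbSwitch = new VectorDbSwitch();
        for (String name : new String[] {"chromadb", "elasticsearch", "redis", "weaviate", "qdrant"}) {
            dbSwitch.register(name, () -> placeholderClient(name));
        }
        return dbSwitch;
    }

    private static String normalize(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    private static VectorStoreClient placeholderClient(String name) {
        return () -> "connected to " + name;
    }

    public static void main(String[] args) {
        // prints "connected to qdrant"
        System.out.println(withDefaults().connect("qdrant").describe());
    }
}
```

Because the client is created per request from the param value, adding another VectorDB only needs one more register call; the endpoints themselves stay unchanged.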
Conclusion :
1- BankNext successfully automated multiple VectorDB launches.
2- Accomplished automation with single script invocation.
3- Created runtime VectorDB switching capability.
4- Effortlessly launched ChromaDB, Redis, ElasticSearch, Weaviate, Qdrant.
5- Provided multi-DB document ingestion, chunking and semantic search.
6- Substantially expedited benchmarking exercises across the org.
7- Provided high confidence in the final results.
8- Employed the VectorDB that is best suited for the specific use case.