
GenVR Research is revolutionizing the Indic LLM space with innovative tools

GenVR Research has emerged as a dominant player in the Indic LLM space, releasing a series of innovative LLMs. On 18th March 2024, it unveiled the first Indic Large Action Model (LAM), Kaali. Unlike conventional LLMs, Large Action Models perform function calls based on prompts, enabling a chatbot to create images in response to user requests. This technology is pivotal in enabling multimodal LLMs like GPT-4.
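The function-calling idea described above can be illustrated with a minimal sketch. This is not GenVR Research's implementation: the tool registry, the `generate_image` placeholder, and the JSON shape of the model output are all assumptions made for illustration. The sketch only shows the general pattern, in which the model emits a structured call and the application routes it to a registered function.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; none of these names come from GenVR Research.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a name the model can refer to."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("generate_image")
def generate_image(prompt: str) -> str:
    # Placeholder only: a real system would call an image model here.
    return f"<image for: {prompt}>"


def dispatch(model_output: str) -> str:
    """Run a function call if the model emitted one, else return the text."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain-text answer, no action requested
    # Anything that is not a call to a registered tool is treated as text.
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        return model_output
    return TOOLS[call["name"]](**call.get("arguments", {}))
```

In this sketch, an ordinary text reply passes through `dispatch` unchanged, while an output such as `{"name": "generate_image", "arguments": {"prompt": "a peacock"}}` is routed to the image placeholder.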

GenVR Research’s Rachnatmak LLM, hosted on its platform, leverages this technology. As one of the first multimodal Indic LLMs in India, it can effectively respond to user queries and generate images when prompted. The Rachnatmak LLM incorporates GenVR Research’s indigenous Rupayan model, specializing in creating Indian images and art.

GenVR Research’s trailblazing LLM innovation dates back to November 2023, with the release of its Hinglish LLM suite based on llama-2-7B and llama-2-13B, which supported vernacular Hindi and Hinglish. On 28th January 2024, it further expanded its portfolio with the launch of Samvaad-LLM, a fine-tuned LLM with over 45 billion parameters supporting more than 9 Indic languages, making it one of the earliest such models in the market.

Furthermore, GenVR Research has open-sourced several Indic LLMs, including Aryabhatta (21st March 2024) and llamavaad (1st March 2024), based on Gemma-7B and llama-2-70B respectively, supporting up to 10 Indic languages.

To date, open-sourced Indic LLMs such as Airavata by AI4Bharat, OpenHathi by Sarvam AI, and Gajendra by BhabhaAI have been based on the 7 billion parameter llama base models. Smaller models tend to exhibit more hallucinations than larger ones. “Our llamavaad, a Hinglish-Hindi-English LLM based on the 70 billion parameter llama, effectively overcomes many of the hallucination challenges faced by smaller models. It has proven to be revolutionary during our customer deployments,” says Akshay Taneja, technology co-founder at GenVR Research.

The startup is also enhancing performance in existing LLMs. On 23rd February 2024, it introduced Buddhiman LLM, a high-performance model created using four top open-sourced LLMs: Qwen-1.5-72B, Deepseek-Maths-7B, Deepseek-Coder-Instruct, and Mixtral. It raised the score on the MATH benchmark from the 51.7% achieved by the Deepseek-Maths model to 54%, approaching the capabilities of advanced LLMs like Gemini Ultra and Claude 3 Opus.

With over 1 million user chats on its platform in more than 14 Indic languages, GenVR Research is actively working with B2B clients to deploy its technology for customer support. “We have seen more than a million chats without a single complaint about the model identifying itself as ChatGPT, making inappropriate references to caste, overemphasizing India’s poverty, or failing to perform simple tasks,” says Akshay Taneja.

“As a leader in the film VFX and animation industry, co-founding GenVR Research has been a natural progression towards pioneering groundbreaking innovations. Our collaborative efforts have led to the development of RUPAYAN, a multilingual text-to-Hindi image model, created with the feedback of numerous artists and experts. This, along with our other advancements in multilingual Indian LLMs, demonstrates our commitment to driving digital inclusivity and pushing the boundaries of technology. These initiatives solidify GenVR Research’s position as an industry leader, shaping the future of language technology,” says Gitanjali Sehgal, business co-founder at GenVR Research and ex-CEO of Pixion.

GenVR Research’s suite of Indic LLMs and models like RUPAYAN are strategically designed to serve a wide spectrum of industries and sectors, emphasizing digital inclusivity across various geographic locations. Primarily, these cutting-edge tools are invaluable to post-production media companies, enabling the integration of regional linguistic nuances into their content creation process. Telecommunications companies can leverage these LLMs to offer customer service in a plethora of Indic languages, significantly enhancing the customer experience. Customer contact centers stand to benefit immensely by utilizing these models for multilingual support, improving both efficiency and customer satisfaction across India. Furthermore, content creators, educators, and developers are equipped with sophisticated tools to produce culturally resonant and language-specific content, fostering a more inclusive digital environment.

GenVR Research has also released over 250 million tokens of Indic language data on GitHub, including the popular Indic positive dataset. This dataset is carefully curated to remove caste, religious, and anti-India bias from existing LLMs. With these open-source contributions, GenVR Research aims to create an unbiased LLM tool for India and revolutionize the Indic LLM space.

Try the tools here: https://app.genvrresearch.com/
