Tether's New Medical AI Models Enhance Local Processing and Data Privacy

By Patricia Miller

May 07, 2026

2 min read

Tether has launched QVAC MedPsy models for mobile devices, enhancing medical AI efficiency while ensuring data privacy.

Tether’s AI Research Group has introduced two new medical language models, QVAC MedPsy-1.7B and QVAC MedPsy-4B, designed for low-power devices such as smartphones and wearables. The models aim to deliver capable medical reasoning while running entirely on-device, keeping patient data private.

The healthcare AI market is expanding rapidly, projected to grow from approximately $36 billion today to over $500 billion by 2033. Traditional AI solutions send sensitive patient data off-site for storage and processing, typically in the cloud, which creates privacy and compliance risks. In response, Tether has focused on keeping medical data secure by processing it locally, directly on the device.

How do the QVAC MedPsy models compare with existing medical AI systems? The 1.7B model averaged 62.62 across seven medical benchmarks, outperforming Google’s MedGemma-4B-it by more than 11 points despite being less than half its size. It also performed strongly on real-world clinical scenarios such as HealthBench Hard, rivaling larger competitors.

The 4B model averaged 70.54 on the same benchmarks, significantly exceeding MedGemma-27B, a model nearly seven times its size. The benchmark suite includes MedQA, MedMCQA, and HealthBench, and the training methodology combines supervised fine-tuning on curated clinical reasoning data with reinforcement learning.

Tether’s team focused on efficiency rather than sheer model size. The models respond quickly and produce concise yet complete answers, which conserves battery on mobile hardware, and they ship in compressed formats suited to mobile devices without sacrificing performance.

Specifically, the 4B model averages around 909 tokens per response, versus roughly 2,953 tokens for comparable systems. Similarly, the 1.7B model averages 1,110 tokens against 1,901 from similar models, a significant efficiency gain in both cases.
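As a back-of-the-envelope check, those figures imply roughly a 69% reduction in output length for the 4B model and about 42% for the 1.7B model. A minimal sketch (the token counts are from the article; the helper function is ours, purely illustrative):

```python
def token_reduction(model_tokens: float, baseline_tokens: float) -> float:
    """Fraction by which a model's average response length undercuts a baseline."""
    return 1 - model_tokens / baseline_tokens

# Average response lengths reported in the article (tokens).
reduction_4b = token_reduction(909, 2953)     # 4B model vs. comparable systems
reduction_1_7b = token_reduction(1110, 1901)  # 1.7B model vs. similar models

print(f"4B model emits {reduction_4b:.0%} fewer tokens")    # ~69%
print(f"1.7B model emits {reduction_1_7b:.0%} fewer tokens")  # ~42%
```

Fewer output tokens means less compute per answer, which translates directly into lower latency and battery drain on a phone or wearable.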

Both models ship in the quantized GGUF format, with file sizes of about 1.2GB for the 1.7B model and 2.6GB for the 4B model. This matters because it reduces computational demand, latency, and cost, letting the models run on standard hardware in healthcare environments. Healthcare institutions can thus apply AI to medical reasoning without the complexity or security risks of cloud dependence.

The QVAC MedPsy models are now freely available under an open license on Hugging Face, marking a significant step in advancing AI applications in healthcare.

Important Notice And Disclaimer

This article does not provide any financial advice and is not a recommendation to deal in any securities or product. Investments may fall in value and an investor may lose some or all of their investment. Past performance is not an indicator of future performance.