Secure Private AI Inference Routing Layer Developed by Researchers
In sensitive industries such as healthcare and finance, the desire to harness the power of large AI models without exposing private data has led researchers to develop a novel approach.
Secure Multi-Party Computation (MPC) and its Limitations
By leveraging a cryptographic technique called Secure Multi-Party Computation (MPC), organizations can split their data into encrypted fragments, distribute them across multiple servers, and compute results without any single server accessing the raw input.
However, this method comes with a significant caveat: speed. Traditional mid-sized language models that return results in under a second can take over 60 seconds when processed under MPC due to the substantial encryption overhead.
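The secret-sharing idea behind MPC can be illustrated with a minimal additive-sharing sketch in Python. This is a toy illustration of the general principle only, not the protocol used by SecureRouter's underlying framework; the field size and function names here are arbitrary choices:

```python
import secrets

PRIME = 2**61 - 1  # arbitrary prime modulus for the toy share arithmetic

def share(value, n_servers=3):
    """Split an integer into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares; all of them are needed, and no proper subset
    reveals anything about the original value."""
    return sum(shares) % PRIME

# Linear operations can be done locally on shares: each server adds its
# shares of two secrets, and reconstructing the results yields the sum.
a_shares, b_shares = share(42), share(100)
sum_shares = [sa + sb for sa, sb in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 142
```

Nonlinear operations such as the activations inside a language model require interactive cryptographic protocols on top of this, which is where the 60x-style overhead comes from.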
Input-Adaptive Routing and Existing Solutions
Prior solutions have attempted to mitigate this issue by redesigning AI models to operate efficiently under encryption. While these efforts have helped, they share a fundamental limitation: every query, regardless of complexity, must pass through the same model at the same cost.
This is where the concept of input-adaptive routing becomes essential.
SecureRouter: A Novel Solution
To address this conundrum, researchers at the University of Central Florida developed a system called SecureRouter. This innovative solution introduces a lightweight routing component that evaluates each incoming encrypted query and selects the most suitable model from a pool of various sizes to process it.
The routing decision remains encrypted, ensuring that neither the client nor the server gains insight into the input data.
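In plaintext terms, the routing logic resembles the sketch below. The model pool, the `difficulty` heuristic, and the thresholds are all hypothetical stand-ins; in SecureRouter the equivalent decision is computed over encrypted inputs, so neither client nor server ever observes the score or the choice:

```python
# Hypothetical plaintext sketch of input-adaptive routing: a lightweight
# scoring function picks the cheapest model expected to handle the query.
MODELS = {
    "small":  {"relative_cost": 1.0},
    "medium": {"relative_cost": 2.5},
    "large":  {"relative_cost": 6.0},
}

def difficulty(query: str) -> float:
    """Toy heuristic: longer queries with rarer (longer) words score higher."""
    words = query.split()
    return min(1.0, len(words) / 50 + sum(len(w) > 8 for w in words) / 10)

def route(query: str) -> str:
    """Select the smallest model whose capacity matches the query's difficulty."""
    score = difficulty(query)
    if score < 0.3:
        return "small"
    if score < 0.7:
        return "medium"
    return "large"
```

The payoff is that most queries never touch the large model, so the average cost per query drops well below the fixed large-model cost while hard queries still get full capacity.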
Impressive Results
When tested against SecFormer, a private inference system that uses a fixed large model, SecureRouter demonstrated impressive speed improvements, reducing average inference time by a factor of 1.95 across five language understanding tasks.
The speedups ranged from 1.83x to 2.19x, depending on task complexity. Compared to running a large model on every query, SecureRouter achieved an average speedup of 1.53x across eight benchmark tasks.
Accuracy and Implications
The accuracy of SecureRouter remained remarkably close to the large-model baseline, except for a notable drop in performance on a grammatical analysis task. This highlights the importance of matching model size to query complexity.
The implications of SecureRouter are significant, as it enables organizations to leverage large AI models while maintaining the confidentiality of sensitive data.
Because it sits atop existing MPC frameworks and uses standard language model architectures, SecureRouter requires no extensive infrastructure changes. Clients receive only the final result, with no indication of which model processed their query.
This innovation paves the way for widespread adoption of private AI inference in high-stakes industries.
The authors conclude that SecureRouter offers a promising solution for balancing security and efficiency in private AI inference, enabling organizations to harness the power of large AI models while protecting sensitive data.