Interview with Dr. Jiannan Ouyang, Co-founder and CEO of Snarkify

Alice Liu: Dr. Ouyang, thank you for sitting down with me today. I’m really excited to dive into your work at Snarkify and discuss the broader implications of ZK hardware acceleration. To start, could you share a bit about your background and how you got into the Web3 space, particularly ZK?

Dr. Jiannan Ouyang: Thanks for having me, Alice. My journey into Web3 started back in 2017, when I was working as a research scientist at Meta - then known as Facebook. At the time, I got involved with crypto mining because of my background in high-performance computing. The idea that performance optimization translates directly into revenue was fascinating to me. That led me to co-found Open Mining, which focused on FPGA and GPU acceleration for proof-of-work computations.

Later, in 2021, we pivoted towards ZK as its potential became increasingly clear. Many of our learnings from crypto mining, particularly around hardware acceleration with GPUs and FPGAs, helped inform our approach. We made a strategic decision early on to focus on GPUs, avoiding FPGAs for proving systems and remaining skeptical of ASIC viability at that stage. Since then, Snarkify has been dedicated to proving useful ZK applications rather than mining-related computations.

Alice: It’s interesting how you were able to apply your learnings from crypto mining to ZK. You touched on the differences between FPGA, GPU, and ASIC-based approaches. Can you break down the trade-offs between them in the context of ZK acceleration?

Dr. Ouyang: Absolutely. The key factor in choosing between these hardware approaches is cost. GPUs, thanks to NVIDIA’s massive market cap and broad applicability across industries, benefit from significant R&D investment and advanced manufacturing processes. This makes them both powerful and cost-effective. FPGAs, on the other hand, are more expensive due to their lower production volume and reliance on older manufacturing nodes. Their programmability is also a challenge - implementing an algorithm on a GPU might take a day or two, while doing the same on an FPGA could take weeks. We experienced this first-hand when working on ZPrize, where our GPU implementation was developed quickly, whereas the FPGA development took months.

ASICs offer potential performance benefits, but their programmability is extremely limited, and developing a flexible, efficient ZK-friendly ASIC is a challenge in itself. This is why we have continued to bet on GPUs - they offer a balance of performance, accessibility, and adaptability that’s difficult for ASICs or FPGAs to match.

Alice: Given that perspective, do you see GPUs continuing to dominate in the future, or do you think we’ll see a mix of different approaches as ASIC development progresses?

Dr. Ouyang: GPUs will continue to dominate until the ZK market is large enough to justify ASIC production at scale. Right now, the market isn’t big enough to sustain an ASIC-focused business model, but that could change over time. It’s not a question of if, but when.

Alice: That makes sense. I also wanted to ask about ZKVM hardware co-design. Many teams are optimizing for CPU first and then moving to accelerators later. What are your thoughts on this approach? Are there limitations to designing for CPUs first before moving to GPUs or ASICs?

Dr. Ouyang: That’s a great question. The key lesson we’ve learned is that ZK acceleration must be approached end-to-end. Initially, many efforts focused on accelerating only specific computational kernels, like MSM and NTT. However, simply optimizing these kernels isn’t enough to significantly speed up the entire proving process.
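
For readers who haven’t run into these kernels, here is a minimal sketch of the bucket (Pippenger-style) trick that fast MSM implementations are typically built around. To stay self-contained it uses a toy additive group - integers mod a prime, so "scalar times point" is just multiplication mod p - instead of elliptic-curve points; the modulus, window size, and input sizes are illustrative assumptions, not anything from Snarkify’s code.

```python
# Toy multi-scalar multiplication (MSM): compute sum(s_i * g_i) in an additive group.
# Here the group is (Z_p, +), so scalar multiplication is just s * g mod p. In a real
# prover the g_i are elliptic-curve points and a naive s_i * g_i costs hundreds of
# group operations, which is exactly the work the bucket trick avoids.
import random

P = (1 << 61) - 1   # toy group order (illustrative assumption)
WINDOW = 8          # window size in bits (illustrative assumption)

def msm_naive(scalars, points):
    return sum(s * g for s, g in zip(scalars, points)) % P

def msm_bucketed(scalars, points):
    bits = max(s.bit_length() for s in scalars)
    num_windows = (bits + WINDOW - 1) // WINDOW
    total = 0
    for w in reversed(range(num_windows)):          # most significant window first
        total = (total << WINDOW) % P               # "double" WINDOW times
        buckets = [0] * (1 << WINDOW)
        for s, g in zip(scalars, points):           # one cheap group addition per input
            digit = (s >> (w * WINDOW)) & ((1 << WINDOW) - 1)
            if digit:
                buckets[digit] = (buckets[digit] + g) % P
        running, window_sum = 0, 0                  # sum_j j * buckets[j] via running sums
        for b in reversed(buckets[1:]):
            running = (running + b) % P
            window_sum = (window_sum + running) % P
        total = (total + window_sum) % P
    return total

scalars = [random.randrange(P) for _ in range(1000)]
points = [random.randrange(P) for _ in range(1000)]
assert msm_naive(scalars, points) == msm_bucketed(scalars, points)
```

The bucket accumulation is regular, data-parallel work, which is why MSM maps well to GPUs - and also why, as noted above, speeding up MSM alone still leaves the rest of the proving pipeline on the critical path.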

What we’ve done at Snarkify is move the entire proof generation pipeline to GPUs. This eliminates CPU-GPU communication overhead, which is a major bottleneck. In our recent ZPrize submission, we achieved a 900x speedup by keeping the entire computation on a single GPU. However, this isn’t yet feasible for most ZKVMs, because their proving workloads need more memory than a single GPU can hold. The next step is to redesign proof systems and ZKVMs so they can scale across multiple GPUs while minimizing communication overhead. That’s why treating accelerators as first-class citizens - rather than an afterthought - is critical.
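
As a rough illustration of that end-to-end point, the sketch below contrasts a staged flow that round-trips every intermediate result through host memory with a fused flow that crosses PCIe once in each direction. It assumes CuPy and a CUDA-capable GPU are available, and the three placeholder stages are simple element-wise math standing in for prover steps - not Snarkify’s actual pipeline.

```python
# Staged vs. fused GPU pipeline: same math, very different amounts of PCIe traffic.
import numpy as np
import cupy as cp  # assumes a CUDA-capable GPU and CuPy installed

def staged_pipeline(data_host: np.ndarray) -> np.ndarray:
    """Each stage copies host -> device, computes, then copies device -> host."""
    for _ in range(3):
        data_dev = cp.asarray(data_host)    # PCIe transfer in
        data_dev = data_dev * data_dev + 1  # placeholder compute stage
        data_host = cp.asnumpy(data_dev)    # PCIe transfer out
    return data_host

def fused_pipeline(data_host: np.ndarray) -> np.ndarray:
    """One transfer in, all stages on-device, one transfer out."""
    data_dev = cp.asarray(data_host)
    for _ in range(3):
        data_dev = data_dev * data_dev + 1  # placeholder compute stage
    return cp.asnumpy(data_dev)

x = np.random.rand(1 << 24).astype(np.float32)  # ~64 MB of input data
assert np.allclose(staged_pipeline(x), fused_pipeline(x))
```

Both versions compute the same result; the staged one simply pays the transfer cost at every stage, which is the overhead an all-on-GPU prover removes.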

Alice: That makes a lot of sense. You mentioned bottlenecks in the proof generation pipeline - where do you see the most significant ones today, and how can hardware acceleration help mitigate them?

Dr. Ouyang: Right now, the biggest bottleneck is data movement between CPUs and GPUs. Many ZK workloads require hundreds of gigabytes of memory, and moving that data over PCIe to GPUs creates a major performance drag. Even if we make GPUs 10x faster, they’ll still be waiting on data transfer. The same applies to ASICs - if we don’t resolve the data movement issue, even the fastest ASICs will be sitting idle.
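
A quick back-of-envelope calculation makes the point. The numbers below are illustrative assumptions - roughly PCIe 4.0 x16 bandwidth and a few-hundred-gigabyte working set - not measurements from any particular prover.

```python
# Why data movement, not raw compute, dominates once workloads reach hundreds of GB.
PCIE_GB_PER_S = 32    # assumed effective host-to-device bandwidth (about PCIe 4.0 x16)
WORKING_SET_GB = 200  # assumed proving working set that must reach the GPU
GPU_COMPUTE_S = 5     # assumed pure on-GPU compute time for the same workload

transfer_s = WORKING_SET_GB / PCIE_GB_PER_S
total_s = transfer_s + GPU_COMPUTE_S
print(f"transfer: {transfer_s:.1f} s, compute: {GPU_COMPUTE_S} s")
print(f"share of wall clock spent moving data: {transfer_s / total_s:.0%}")

# A 10x faster GPU only shrinks the 5 s compute term; the ~6 s transfer term is
# untouched, so the end-to-end win stays well under 2x - which is why cutting the
# transfers matters more than raw FLOPs.
faster_total_s = transfer_s + GPU_COMPUTE_S / 10
print(f"with a 10x faster GPU: {faster_total_s:.1f} s vs {total_s:.1f} s today")
```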

Alice: That really underscores why ZKVMs need to be designed for accelerators from the start rather than being optimized for CPUs first. Shifting gears a bit, many people associate the ZK endgame with real-time proving. How do you define real-time proving, and what applications do you think it will enable?

Dr. Ouyang: Real-time proving, as defined by Vitalik and Justin Drake, means generating proofs within the block time. This allows each L1 block to be proven before the next one is produced. The biggest application for this is in Layer 2s - if they can prove their blocks within the same 12-second Ethereum block time, it enables real-time cross-rollup communication without the need for bridges, reducing fragmentation. Many teams, including Scroll and zkSync, are actively working on this.
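
In budget terms, the definition is simply that end-to-end proving latency for a block must stay under the slot time. The sketch below makes that concrete; the baseline latency is an assumed figure for illustration, not a benchmark of any specific prover.

```python
# Real-time proving budget: a proof is "real-time" if it lands before the next block.
BLOCK_TIME_S = 12.0       # Ethereum slot time
baseline_prove_s = 180.0  # assumed current proving latency for one block (illustrative)

def is_real_time(prove_latency_s: float, block_time_s: float = BLOCK_TIME_S) -> bool:
    """Proving keeps up with the chain only if latency stays under the block time."""
    return prove_latency_s < block_time_s

print(is_real_time(baseline_prove_s))  # False: the prover falls further behind every block
print(f"speedup needed: {baseline_prove_s / BLOCK_TIME_S:.0f}x")  # 15x in this illustrative case
```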

Alice: That’s a huge step for interoperability. Speaking of usability, Snarkify has focused heavily on developer experience. Can you share more about your approach and how you’re helping developers adopt ZK?

Dr. Ouyang: We took a unique approach by prioritizing developer experience from the beginning. Our Elastic Prover, which lets developers deploy their code to GitHub and get automatic proof generation, was ready over a year ago. Initially, adoption was slow because projects were still in the R&D phase. But now that they’re moving to production, we’re seeing increased demand for performance and cost optimization.

Alice: That’s great to hear. Snarkify also emphasizes decentralizing the proving network. What are the main benefits of decentralizing proof generation, and what challenges come with it?

Dr. Ouyang: Decentralization primarily helps reduce proving costs by creating a more competitive supply. It also enables real-time payments for proof generation, rather than relying on traditional Web2 billing models. However, decentralization for its own sake doesn’t always add value - what matters is whether it improves cost, security, and efficiency.

Alice: That’s a pragmatic approach. One last question - what are your long-term predictions for ZK over the next 5 to 10 years?

Dr. Ouyang: In five years, I expect ZKVMs to become mainstream, allowing developers to write applications without needing deep cryptographic expertise. In ten years, I see ZK extending beyond blockchain into industries like finance and healthcare, enabling verifiable computation at scale. I chose to work in ZK because I believe in that future, and I’m excited to help build it.

Alice: That’s a fantastic vision. Thank you so much for this conversation, Dr. Ouyang. It’s been great to hear your insights on ZK hardware acceleration and the future of the space.

Dr. Ouyang: Thanks, Alice. I appreciate the discussion!

Follow @drouyang and @Snarkify_ZKP on X for the latest updates on their work.
