Interview with DeepSeek Founder Liang Wenfeng: "Our Destination is AGI, Not Just Making a Quick Buck"

Liang Wenfeng said: The excitement of AI research can't be measured by money alone. It's like buying a piano: you have to be able to afford it, but you also need people who are eager to play it.

In May 2024, DeepSeek skyrocketed to fame. The turning point came with the release of its open-source model, DeepSeek V2, which offered an unprecedented balance of cost and performance. This breakthrough sparked a price war among Chinese AI models.

With the launch of the V3 open-source model, DeepSeek grabbed the spotlight once again—this time on a global scale.

As the only player outside the major tech companies to have stockpiled tens of thousands of A100 chips, DeepSeek has made many unconventional choices. Rather than pursuing models and applications at once, it has remained steadfastly focused on research and technology: it has not ventured into consumer-facing applications, has avoided full-scale commercialization, and has embraced an open-source strategy, all without seeking external funding.

So, how did DeepSeek achieve this? The team at Dark Current (暗涌), a division of the Chinese tech media platform 36Kr, conducted interviews with Liang Wenfeng, DeepSeek's elusive founder, in May 2023 and July 2024.

(Note: This article is extensive, running over 6,000 words—consider bookmarking it for later reading.)

How Did the Price War for Large AI Models Begin?

Dark Current: After the release of DeepSeek V2, the large AI model market experienced an intense price war. Some have described DeepSeek as a "catfish" in the industry.
Liang Wenfeng: We didn’t intend to become a "catfish"—it just happened accidentally.

Dark Current: Were you surprised by this outcome?
Liang Wenfeng: Very surprised. We didn’t expect pricing to be such a sensitive issue. We simply followed our own pace, calculated costs, and set a fair price. Our principle has always been not to lose money but also not to make excessive profits. The price was just slightly above cost.

Dark Current: Five days later, Zhipu AI followed suit, and soon after, major companies like ByteDance, Alibaba, Baidu, and Tencent joined the fray.
Liang Wenfeng: Zhipu AI lowered the price of an entry-level product, but their models on the same level as ours were still expensive. ByteDance was the first to genuinely match our pricing on a flagship model, which triggered other major players to lower their prices. These large companies have much higher costs than us, so we didn’t expect anyone to lose money just to compete. It ended up resembling the burn-and-subsidize strategies of the internet era.

Dark Current: To outsiders, these price cuts look like an attempt to grab users, as is common in internet-era price wars.
Liang Wenfeng: Gaining users wasn’t our main goal. We reduced prices for two reasons: first, because our costs naturally decreased as we explored next-generation model structures; and second, we believe both APIs and AI should be affordable and accessible to everyone.

Dark Current: Before this, many Chinese companies directly copied the Llama structure for their applications. Why did DeepSeek choose to focus on model structures instead?
Liang Wenfeng: If your goal is to develop applications, then adopting the Llama structure and quickly launching a product makes sense. But our destination is AGI, which means we need to innovate new model structures to achieve stronger capabilities within limited resources. This is foundational research necessary for scaling up to larger models.

Besides model structures, we’ve conducted extensive research on other aspects, such as how to construct datasets and how to make models more human-like. These efforts are reflected in the models we’ve released. Moreover, the Llama structure is already about two generations behind the cutting-edge international standards in terms of training efficiency and inference costs.

Dark Current: Where does this gap primarily come from?
Liang Wenfeng: First, there’s a gap in training efficiency. We estimate that even the best domestic efforts lag behind global leaders by a factor of two in model structure and training dynamics. This means we need twice the computing power to achieve the same results. Additionally, there’s a data efficiency gap of about two times as well, meaning we require twice the training data and computing resources for equivalent performance. Combined, this results in needing four times the computing power. Our goal is to continuously narrow these gaps.
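As a back-of-the-envelope illustration of how those estimates combine (the factors of two are Liang's rough figures from the interview, not measured values), the two gaps compound multiplicatively rather than additively:

$$\text{relative compute needed} \approx \underbrace{2}_{\text{structure / training efficiency}} \times \underbrace{2}_{\text{data efficiency}} = 4$$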

Dark Current: Most Chinese companies pursue both models and applications. Why has DeepSeek chosen to focus solely on research and exploration for now?
Liang Wenfeng: We believe the most important thing right now is to participate in the wave of global innovation. For many years, Chinese companies have been accustomed to others driving technological innovation while we focus on applying and monetizing it. But this shouldn’t be taken for granted. In this wave, our starting point is not to seize the opportunity to make a quick buck, but to push the boundaries of technology and contribute to the development of the ecosystem.

Dark Current: In the internet and mobile internet eras, there was a prevailing belief that the U.S. excels at technological innovation while China excels at applications.
Liang Wenfeng: We think that as China’s economy grows, it must also become a contributor rather than just a beneficiary. Over the past 30 years of the IT wave, we barely participated in real technological innovation. We’ve grown accustomed to Moore’s Law delivering better hardware and software every 18 months, as if by magic. The same mindset applies to Scaling Laws.

But in reality, these advancements are the result of generations of relentless effort by Western-led technology communities. Because we weren’t involved in the process, we’ve overlooked the work that went into them.

The True Gap Between China and the U.S. Lies in Imitation vs. Originality

Dark Current: Why did DeepSeek V2 surprise so many people in Silicon Valley?
Liang Wenfeng: In the U.S., where countless innovations happen daily, this is a fairly ordinary achievement. They were surprised because it came from a Chinese company that joined the game as an innovator rather than a follower. Most Chinese companies are used to following, not leading.

Dark Current: But in the Chinese context, this choice seems almost extravagant. Large AI models require heavy investment, and not every company can afford to prioritize research and innovation over immediate commercialization.
Liang Wenfeng: Innovation certainly isn’t cheap, and the tendency to "borrow and adopt" stems from China’s past circumstances. But now, if you look at China’s economic scale and the profits of companies like ByteDance and Tencent, they rank among the highest globally. What we lack isn’t capital—it’s confidence and the know-how to organize high-density talent into effective innovation teams.

Dark Current: Why do Chinese companies—including cash-rich giants—so often prioritize rapid commercialization over innovation?
Liang Wenfeng: Over the past 30 years, we’ve been solely focused on making money, largely ignoring innovation. But innovation isn’t purely driven by business; it also requires curiosity and a desire to create. We’re still bound by this past inertia, but it’s just a phase.

Dark Current: But you’re still a business, not a nonprofit research institution. You’ve chosen to innovate and then share your results through open-source channels. Where does your competitive moat come from? For instance, the MLA architecture innovation you introduced in May 2024 will likely be copied soon by others, right?
Liang Wenfeng: In the face of disruptive technologies, moats built through closed-source strategies are temporary. Even OpenAI, with its closed-source approach, can’t prevent others from catching up. That’s why we focus on embedding value within our team. As our colleagues grow through these processes, they accumulate know-how and build an organization and culture capable of continuous innovation—that’s our moat.

Open-sourcing and publishing papers don’t actually mean losing anything. For tech professionals, being followed is a huge accomplishment. Open-sourcing is more of a cultural act than a commercial one. Giving back is an added honor, and this approach also helps a company build a culturally appealing identity.

Dark Current: What’s your view on the market-first philosophy advocated by people like Zhu Xiaohu (a well-known Chinese internet investor)?
Liang Wenfeng: Zhu Xiaohu’s approach is self-consistent, but it’s more suited for companies aiming for quick profits. If you look at the most profitable companies in the U.S., they’re all high-tech companies that take a long-term, deeply rooted approach.

Dark Current: When it comes to large AI models, pure technological leadership rarely translates to absolute advantage. What’s the bigger picture you’re betting on?
Liang Wenfeng: We believe Chinese AI cannot remain in a perpetual follower role. People often say there’s a one- or two-year gap between Chinese AI and the U.S., but the real gap lies in originality versus imitation. If this doesn’t change, China will always be a follower. Some explorations are simply unavoidable.

NVIDIA’s leadership isn’t the result of a single company’s efforts but the collective work of the Western tech community and industry. They can foresee next-generation trends and hold detailed roadmaps. For Chinese AI to advance, we need a similar ecosystem. Many domestic chip projects struggle because they lack a supporting tech community and rely solely on second-hand information. For this reason, China must have pioneers at the forefront of technology.

DeepSeek: Building Large Models for Research and Exploration

Dark Current: Why did Quantum (DeepSeek's parent company) decide to get involved in developing large AI models? After all, it's a quantitative fund—why take on this kind of project?
Liang Wenfeng: Developing large models doesn’t have a direct connection to quant trading or finance. That’s why we created a separate company, DeepSeek, to pursue this endeavor. Many of Quantum’s core team members have backgrounds in AI, and over the years, we’ve worked on various challenges. We started with finance because it’s a highly complex domain, and general artificial intelligence (AGI) could be one of the next most challenging frontiers. For us, it wasn’t about "why do it" but rather "how to do it."

Dark Current: Are you building a general-purpose large model or focusing on a specific domain, like finance?
Liang Wenfeng: Our goal is AGI—general artificial intelligence. Language models are likely a crucial step toward AGI and already exhibit some AGI-like characteristics. So, we’re starting there. In the future, we’ll also expand into areas like vision.

Dark Current: Many startups have abandoned the pursuit of general-purpose large models, especially with big tech companies entering the space.
Liang Wenfeng: We’re not rushing to design applications based on the models. Our focus remains firmly on the development of large models themselves.

Dark Current: Some argue that it’s not the best time for startups to enter this space, given the involvement of big tech companies.
Liang Wenfeng: Right now, neither big tech companies nor startups can easily establish a dominant technical edge in the short term. With OpenAI setting the direction and much of the foundational research being publicly available, including papers and code, both groups will likely have their large language models ready by next year.

Big tech and startups each have their own opportunities. Current vertical application scenarios aren’t dominated by startups, making this phase challenging for them. However, because these scenarios are ultimately fragmented and decentralized, they’re actually better suited for the flexibility of startups.

In the long term, as the barriers to using large models decrease, startups will have opportunities at any point in the next 20 years. For us, the goal is clear: we’re here to research and explore, not to develop niche applications or vertical solutions.

Dark Current: Why do you define your mission as "research and exploration"?
Liang Wenfeng: It’s driven by curiosity. On a broader level, we aim to test certain hypotheses. For example, we believe that human intelligence might fundamentally be rooted in language—that human thought is essentially a process of weaving language together. What we perceive as thinking might just be our brain composing language. This suggests that human-like AGI could emerge from language models.

On a more immediate level, there are still many mysteries surrounding models like GPT-4. While replicating such models, we’re also conducting research to uncover these secrets.

Dark Current: But research involves significantly higher costs.
Liang Wenfeng: If we were just replicating existing models, we could rely on publicly available papers and open-source code, requiring only minimal training or fine-tuning—at a much lower cost. Research, on the other hand, involves extensive experimentation and comparison, requiring far more computational power and higher-caliber personnel, which drives up the cost.

Dark Current: How do you fund this research?
Liang Wenfeng: Quantum, as one of our investors, provides sufficient R&D funding. Additionally, Quantum has an annual philanthropic budget of several hundred million yuan, which typically goes to charitable organizations. If needed, we could allocate some of that funding toward this initiative.

Dark Current: But creating foundational large models is a game that requires at least $200–300 million. How do you ensure sustainable funding?
Liang Wenfeng: We’re in talks with various funding sources. Many venture capitalists (VCs) are hesitant about research-focused projects because they need an exit strategy and prefer quicker commercialization. Our research-first approach makes it hard to secure traditional VC funding. However, we already have computational power and an engineering team, which gives us a solid foundation to build on.

Dark Current: Have you explored potential business models?
Liang Wenfeng: Our current plan is to make most of our training results publicly available, enabling low-cost access to large models for more people—even small app developers. This approach could also create opportunities for commercialization while preventing the technology from being monopolized by a few companies.

Dark Current: Major tech companies will also offer services later. What sets you apart?
Liang Wenfeng: Big tech models are likely to be tied to their platforms or ecosystems, whereas ours will remain entirely independent.

Dark Current: Still, it seems a bit crazy for a commercial company to undertake open-ended, high-cost research without clear returns.
Liang Wenfeng: If you insist on finding a commercial justification, there probably isn’t one—it simply doesn’t make financial sense. Foundational research often has a low return on investment. When OpenAI’s early investors put in money, they weren’t thinking about the returns; they genuinely wanted to pursue this mission.

What we’re certain of is that we want to do this, we have the capability, and at this point in time, we’re one of the most suitable candidates to take on this challenge.

DeepSeek: Curiosity-Driven Innovation with GPU Investments

Dark Current: GPUs have become a scarce commodity in the ChatGPT-driven startup wave. Quantum foresaw this in 2021 and stocked 10,000 GPUs. Why?
Liang Wenfeng: It wasn’t a sudden decision. We gradually scaled up—from 1 GPU in the beginning to 100 GPUs in 2015, 1,000 in 2019, and then to 10,000. Initially, we hosted our GPUs in third-party data centers (IDCs), but as the scale grew, that approach no longer met our needs, leading us to build our own facilities. While this might seem like a decision driven by some hidden commercial logic, the primary driver was curiosity.

Dark Current: What kind of curiosity?
Liang Wenfeng: A curiosity about the boundaries of AI capabilities. For outsiders, the ChatGPT wave has been a seismic shift; for insiders, the real game-changer happened in 2012 with AlexNet. Its error rates were significantly lower than other models at the time, reigniting neural network research after decades of dormancy. While specific technical directions have evolved since then, the fundamental combination of models, data, and computational power remains constant. OpenAI's release of GPT-3 in 2020 made it clear that significant computational power was needed. When we started building our Firefly II supercomputer in 2021, most people couldn’t understand the vision behind it.

(Note: In 2020, Quantum invested ¥200 million to develop Firefly I, an AI supercomputer. In 2021, it committed ¥1 billion to build Firefly II.)

Dark Current: So, as early as 2012, you recognized the importance of computational power?
Liang Wenfeng: Researchers always crave more computational power. After conducting small-scale experiments, you naturally want to scale up. Since then, we’ve consciously worked toward deploying as much computational power as possible.

Dark Current: Some assume that building a computer cluster like yours is primarily to support machine learning for price predictions in quantitative trading.
Liang Wenfeng: If it were just for quantitative trading, we wouldn’t need many GPUs at all. Our research extends far beyond investments. We’re exploring whether there’s a paradigm that can fully describe the financial markets, whether simpler expressions exist, and the boundaries of different paradigms. This research may even have broader applicability beyond finance.

Dark Current: But this is undoubtedly a costly pursuit.
Liang Wenfeng: Truly exciting endeavors can’t always be measured purely by cost. It’s like buying a piano at home: first, you need to afford it, and second, there has to be a group of people eager to play it.

Dark Current: GPUs typically depreciate at a rate of 20% per year.
Liang Wenfeng: We haven’t done precise calculations, but the depreciation rate is likely much lower. NVIDIA GPUs hold their value well—even older models are still widely used. We’ve sold retired GPUs in the second-hand market and recouped a significant portion of their value.

Dark Current: Beyond the hardware cost, maintaining a computer cluster involves labor, maintenance, and electricity expenses.
Liang Wenfeng: Electricity and maintenance costs are actually very low—about 1% of the hardware cost annually. Labor costs are higher, but they’re also an investment in the future. Talent is the company’s greatest asset, and we hire people with genuine curiosity who are eager to conduct research.

Dark Current: In 2021, Quantum was one of the first companies in the Asia-Pacific region to secure NVIDIA A100 GPUs. How did you manage to beat some cloud providers to it?
Liang Wenfeng: We started researching, testing, and planning for new GPUs early on. As for some cloud providers, their demand was fragmented until 2022, when autonomous driving and other applications began requiring rented machines for training. Only then did they build their infrastructure. Large corporations are often driven by business needs rather than pure research or training.

Dark Current: How do you view the competitive landscape for large models?
Liang Wenfeng: Large corporations undoubtedly have advantages. However, if they can’t quickly apply their models, they might struggle to sustain their efforts because they need visible results. Leading startups often have strong technical foundations, but like earlier waves of AI startups, they face commercialization challenges.

Dark Current: Some people think a quantitative fund emphasizing AI is just a marketing gimmick for its other businesses.
Liang Wenfeng: Actually, our quantitative fund has largely stopped raising external capital.

Dark Current: How do you distinguish between AI believers and opportunists?
Liang Wenfeng: Believers are those who were here before, remain here now, and will still be here in the future. They are the ones who buy GPUs in bulk or sign long-term agreements with cloud providers rather than renting for the short term.

DeepSeek V2: Homegrown Talent Driving Innovation

Dark Current: Jack Clark, former policy director at OpenAI and co-founder of Anthropic, described DeepSeek as employing a "group of enigmatic geniuses." What kind of people created DeepSeek V2?
Liang Wenfeng: They aren’t enigmatic geniuses. Most of them are fresh graduates from top Chinese universities, PhD students in their fourth or fifth year of study, and young professionals just a few years out of school.

Dark Current: Many large-model companies are obsessed with recruiting talent from overseas. Some believe the top 50 experts in this field likely don’t work for Chinese companies. Where does your team come from?
Liang Wenfeng: None of the people who developed the V2 model are from overseas; they’re all local talents. While it’s true that the top 50 experts might not be in Chinese companies, we believe we can cultivate such talent ourselves.

Dark Current: The MLA (Multi-Head Latent Attention) innovation reportedly stemmed from a young researcher’s personal interest. Could you elaborate on how this breakthrough happened?

(DeepSeek’s innovative MLA architecture reduces memory consumption to just 5–13% of the commonly used MHA architecture.)

Liang Wenfeng: After identifying some patterns in the evolution of attention mechanisms, the researcher had a sudden inspiration to design an alternative. However, turning that idea into reality was a long process. We assembled a team specifically for this purpose, and it took several months of effort to get it working.
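The interview doesn't go into the mechanics, but the published idea behind MLA is to compress the attention key-value cache into a small shared latent vector per token and re-expand it at attention time. Below is a minimal, hypothetical sketch of that low-rank KV-compression idea; the module names and dimensions are made up for illustration, and this is not DeepSeek's actual implementation (which, among other things, handles rotary position embeddings separately).

```python
# Minimal sketch (not DeepSeek's actual code) of the low-rank KV-compression idea
# behind MLA: cache one small latent vector per token instead of full per-head
# keys and values, and re-expand the latent at attention time.
import torch
import torch.nn as nn

d_model, n_heads, d_head, d_latent = 1024, 16, 64, 128  # illustrative sizes


class LatentKVAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(d_model, n_heads * d_head)
        self.kv_down = nn.Linear(d_model, d_latent)         # down-project: this latent is what gets cached
        self.k_up = nn.Linear(d_latent, n_heads * d_head)   # re-expand latent to per-head keys
        self.v_up = nn.Linear(d_latent, n_heads * d_head)   # re-expand latent to per-head values
        self.out = nn.Linear(n_heads * d_head, d_model)

    def forward(self, x, latent_cache=None):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, n_heads, d_head).transpose(1, 2)
        latent = self.kv_down(x)                             # (b, t, d_latent)
        if latent_cache is not None:                         # append to the running cache during decoding
            latent = torch.cat([latent_cache, latent], dim=1)
        k = self.k_up(latent).view(b, -1, n_heads, d_head).transpose(1, 2)
        v = self.v_up(latent).view(b, -1, n_heads, d_head).transpose(1, 2)
        attn = torch.softmax(q @ k.transpose(-2, -1) / d_head ** 0.5, dim=-1)  # causal mask omitted for brevity
        y = (attn @ v).transpose(1, 2).reshape(b, t, n_heads * d_head)
        return self.out(y), latent                           # return updated latent cache

# With these sizes, standard multi-head attention would cache 2 * 16 * 64 = 2048
# values per token (keys plus values), while this sketch caches only 128 (~6%),
# in the same ballpark as the 5-13% figure quoted above.
```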

Dark Current: This kind of divergent creativity seems connected to your highly innovative organizational structure. During the Quantum era, you rarely assigned tasks from the top down. Does AGI’s uncertainty require more management oversight?
Liang Wenfeng: DeepSeek also operates entirely bottom-up. We avoid predefined divisions of labor, preferring natural division. Everyone brings their unique growth experiences and ideas, so they don’t need much supervision. During the exploration process, when someone encounters a problem, they naturally seek out others to discuss it. However, when an idea shows potential, we do allocate resources top-down to support it.

Dark Current: I’ve heard DeepSeek is highly flexible in allocating GPUs and personnel.
Liang Wenfeng: There’s no cap on the GPUs or human resources anyone can mobilize. If someone has an idea, they can call on the training cluster’s GPUs without needing approval. Additionally, since we don’t have rigid hierarchies or departmental silos, people can easily collaborate as long as the other person is interested.

Dark Current: Such a loosely managed approach relies heavily on recruiting passion-driven individuals. It’s said you excel at spotting unconventional talent through nontraditional metrics.
Liang Wenfeng: Our recruitment criteria have always been based on passion and curiosity. As a result, many of our hires have intriguing, unconventional experiences. Their enthusiasm for research often far outweighs their focus on monetary rewards.

Dark Current: Transformer was born in Google’s AI Lab, and ChatGPT at OpenAI. What’s the difference between the innovation potential of big tech AI labs and a startup like yours?
Liang Wenfeng: Whether it’s Google’s AI Lab, OpenAI, or even Chinese tech giants’ AI Labs, they all create immense value. The fact that OpenAI succeeded in bringing ChatGPT to life also involved a certain degree of historical serendipity.

Innovation and Belief: Shaping the Future with DeepSeek

Dark Current: Innovation often seems like a product of chance. I noticed the doors on both sides of your meeting rooms can be easily pushed open, leaving room for serendipity. It reminds me of how someone passing by overheard discussions during Transformer’s development and contributed to turning it into a general framework.
Liang Wenfeng: I believe innovation starts with belief. Why is Silicon Valley so innovative? Because they dare to try. When ChatGPT emerged, there was a lack of confidence in pioneering innovation in China—from investors to big companies, most felt the gap was too vast and focused on applications instead. But innovation demands self-confidence, which is often more apparent in young people.

Dark Current: Despite not participating in financing or making frequent public appearances, how do you ensure that DeepSeek becomes the go-to choice for top talent in large-model development?
Liang Wenfeng: Because we’re tackling the hardest problems. The most challenging issues in the world are naturally attractive to top talent. Hardcore innovation is rare in China, so top talent often doesn’t get recognized. By working on these problems, we create opportunities for them to shine.

Dark Current: OpenAI’s recent event didn’t unveil GPT-5, which led many to believe the pace of technological progress is slowing. Some have begun questioning the validity of the scaling laws. What’s your perspective?
Liang Wenfeng: We’re optimistic. The industry’s trajectory aligns with expectations. OpenAI is not infallible and cannot always lead the charge.

Dark Current: When do you think AGI will be achieved? Your team transitioned from dense models to MoE (Mixture of Experts) and released models focused on coding and mathematics before launching DeepSeek V2. What does your AGI roadmap look like?
Liang Wenfeng: It could take two years, five years, or even ten years—but it will happen within our lifetime. Internally, there’s no unified opinion on the roadmap. However, we’re betting on three directions: mathematics and code, multimodality, and natural language itself. Mathematics and code provide a closed, verifiable system—like Go—where high intelligence can emerge through self-learning. On the other hand, engaging with the real world via multimodal learning may be essential for AGI. We remain open to all possibilities.

Dark Current: What do you envision as the endgame for large models?
Liang Wenfeng: There will be specialized companies providing foundational models and services, forming a long chain of professional divisions. Others will build on top to meet diverse societal needs.

Dark Current: This past year saw shifts among China's large-model startups. For instance, Wang Huiwen (co-founder of Meituan, the Chinese food-delivery giant) exited mid-stage, and new companies began differentiating themselves.
Liang Wenfeng: Wang Huiwen took on all the losses himself, ensuring everyone else exited unscathed. He made a choice that was unfavorable to himself but beneficial to others. I admire his integrity.

Dark Current: What currently occupies most of your energy?
Liang Wenfeng: Primarily researching the next generation of large models. There are still many unresolved challenges.

Dark Current: Other large-model startups focus on both technology and product development, aiming to capitalize on their technological edge during this window of opportunity. DeepSeek focuses exclusively on model research—does this mean the models aren’t ready yet?
Liang Wenfeng: All established approaches are products of the previous generation, which might not hold true for the future. Applying internet-era business logic to AI’s future profitability is like discussing General Electric or Coca-Cola during Tencent’s early days—it’s likely outdated thinking.

Dark Current: Your experience with Quantum’s growth was relatively smooth. Does that shape your optimism?
Liang Wenfeng: Quantum strengthened our confidence in technology-driven innovation, but it wasn’t always smooth sailing. We underwent a long period of accumulation. What people see are the successes post-2015, but we’ve been at this for 16 years.

Dark Current: Let’s talk about original innovation. As the economy slows and capital tightens, will this suppress original innovation?
Liang Wenfeng: Not necessarily. Structural adjustments in China’s economy increasingly depend on hardcore technological innovation. When people realize quick profits were often a result of favorable timing, they’ll be more inclined to pursue genuine innovation.

Dark Current: So you’re optimistic about this trend?
Liang Wenfeng: Yes. I grew up in a fifth-tier city in Guangdong during the 1980s. My father was an elementary school teacher. In the 1990s, Guangdong had many money-making opportunities, and some parents even told my father that studying was useless. But things have changed—making money isn’t easy anymore, and even opportunities like driving taxis are scarce. Within one generation, attitudes have shifted.

Hardcore innovation will become more prevalent. It’s not yet fully understood because society still needs to learn from experience. When people see innovators succeed, societal perceptions will change. We just need more examples and time.

Sustaining Innovation: DeepSeek’s Vision for AGI and Industry Ecosystems

Dark Current: DeepSeek currently embodies an early-stage idealism reminiscent of OpenAI, including its open-source approach. Will you consider transitioning to a closed model in the future? OpenAI and Mistral both shifted from open-source to closed-source at some point.
Liang Wenfeng: We won’t go closed-source. We believe establishing a strong technical ecosystem is more important in the early stages.

Dark Current: Do you have any financing plans? Media reports suggest that Quantum has plans for spinning off and listing DeepSeek independently. In Silicon Valley, AI startups eventually bind themselves to major companies.
Liang Wenfeng: We have no short-term financing plans. Our challenge has never been funding but rather the restrictions on high-end chip exports.

Dark Current: Many believe AGI development and quant investing are fundamentally different. While quant can thrive in secrecy, AGI might require a high-profile approach with alliances to expand resources. Doesn’t bigger input lead to more innovation?
Liang Wenfeng: More resources don’t necessarily equate to more innovation. Otherwise, big corporations would monopolize all innovation.

Dark Current: You’re currently not pursuing applications—is that because you lack operational expertise?
Liang Wenfeng: We believe this is a period of technological innovation, not application explosions. In the long run, we aim to foster an ecosystem where the industry directly uses our technology and outputs. Our focus will remain on foundational models and frontier innovation, while other companies build B2B or B2C businesses atop DeepSeek. If this upstream-downstream industry ecosystem forms, there’s no need for us to develop applications ourselves. That said, we could, but research and innovation will always be our top priority.

Dark Current: If the industry adopts APIs, why choose DeepSeek over offerings from major tech companies?
Liang Wenfeng: The future likely belongs to specialized divisions of labor. Foundational models require sustained innovation. Large corporations have their limits and aren’t necessarily well-suited for this role.

Dark Current: Can technology truly create a significant competitive gap? You’ve said before there are no absolute technological secrets.
Liang Wenfeng: While there are no secrets, resetting and catching up requires time and resources. For example, Nvidia’s GPUs have no theoretical secrets and seem easy to replicate, yet organizing a team to pursue and match their next-generation technologies takes significant effort. This creates a wide moat in practice.

Dark Current: After your price reductions, ByteDance followed suit, indicating they felt competitive pressure. How do you view new strategies for startups competing with big tech?
Liang Wenfeng: Honestly, we’re indifferent to this. It was something we did incidentally, not a primary goal. Providing cloud services isn’t our main focus; achieving AGI is. Currently, I don’t see any groundbreaking new approaches, nor do large companies have an obvious advantage. While they have existing user bases, their cash-flow businesses can become liabilities, making them more prone to disruption.

Dark Current: How do you see the endgame for the six other large-model startups outside of DeepSeek?
Liang Wenfeng: Maybe 2–3 will survive. Right now, they’re all in a money-burning phase, so those with clear positioning and precision in operations will have better chances. Others might transform drastically. Valuable components won’t disappear but may take on new forms.

Dark Current: During the Quantum era, your stance on competition was described as "independent and self-assured," rarely engaging in horizontal comparisons. What is your fundamental perspective on competition?
Liang Wenfeng: I often reflect on whether something improves societal operational efficiency and whether it allows us to find our strengths within the industry’s division of labor. If the endgame increases efficiency, it’s valid. Most competitive dynamics are transitional. Over-focusing on them can be distracting.

Self-Driven Innovation: The Human Element in DeepSeek’s Model

Dark Current: How is DeepSeek progressing with recruitment?
Liang Wenfeng: The initial team is in place. Since manpower was limited in the early stages, we temporarily borrowed some team members from Quantum. We began hiring as soon as ChatGPT became popular, but we still need more people to join us.

Dark Current: Talent in the large-model startup space is scarce. Some investors claim that the most suitable candidates are likely only found in AI labs at giants like OpenAI or Facebook AI Research. Will you recruit such talent from overseas?
Liang Wenfeng: For short-term goals, hiring experienced individuals makes sense. But for the long run, experience is less important than fundamental abilities, creativity, and passion. From this perspective, there are many suitable candidates domestically.

Dark Current: Why is experience less critical?
Liang Wenfeng: It’s not always the case that only those with experience can accomplish a task. At Quantum, we focused on evaluating abilities rather than experience. For core technical roles, we often hire fresh graduates or those with just a year or two of work experience.

Dark Current: Is experience a hindrance in innovation-driven work?
Liang Wenfeng: Experienced individuals often jump to pre-defined solutions, whereas those without experience explore multiple possibilities and carefully think through their approach, often finding solutions better suited to the current context.

Dark Current: Quantum transitioned from an industry outsider to a leading player in quantitative investing within a few years. Was this hiring philosophy a key factor?
Liang Wenfeng: Our core team, myself included, had no prior experience in quant. It’s not necessarily a secret to our success, but it’s a cultural trait of Quantum. We don’t avoid hiring experienced people, but ability is what we prioritize.

Take sales as an example: our two top-performing salespeople had no prior industry experience. One used to work in German mechanical goods exports, and the other was a backend developer at a brokerage. They started with no experience, resources, or network but helped us become the only private equity firm relying primarily on direct sales.

Dark Current: Why haven’t others succeeded in replicating this?
Liang Wenfeng: Because innovation isn’t just about copying a single element; it requires alignment with company culture and management. For instance, those two salespeople took a year to produce any results, and that was only workable because we don’t evaluate people through conventional KPIs or top-down task assignments.

Dark Current: What standards do you use for evaluation?
Liang Wenfeng: We don’t emphasize immediate metrics like order volume. Instead, we encourage salespeople to expand their networks and build trust. A good salesperson might not close deals quickly but establishes credibility and long-term relationships.

Dark Current: Once you find the right people, how do you help them integrate?
Liang Wenfeng: We assign them critical responsibilities without micromanaging. They are free to devise their own methods. Building a company’s DNA is hard to replicate—how to identify potential, nurture growth, and evaluate success are all unique to the organization.

Dark Current: What’s essential for fostering an innovative organization?
Liang Wenfeng: Innovation thrives with minimal interference and management, allowing individuals freedom and room for trial and error. Innovation isn’t planned or taught—it emerges organically.

Dark Current: How do you ensure effectiveness and alignment with goals under such a loose structure?
Liang Wenfeng: By ensuring alignment in values during recruitment and reinforcing them through company culture. We don’t have a formalized corporate culture because codifying things often stifles innovation. Instead, leadership sets the tone by leading through example.

Dark Current: Could an innovative organizational structure be a game-changer for startups competing with large corporations in this wave of AI development?
Liang Wenfeng: If you follow textbook methods, startups will seem doomed in today’s environment. But markets change. The real determinant is not predefined rules or conditions but the ability to adapt and evolve. Many large corporations struggle to respond quickly, weighed down by previous experiences and inertia. In this wave of AI innovation, a new class of companies is bound to emerge.

Dark Current: What excites you most about this journey?
Liang Wenfeng: Discovering whether our hypotheses are correct—and if they are, that’s incredibly exciting.

Dark Current: What’s non-negotiable in this round of hiring?
Liang Wenfeng: Passion and strong foundational abilities. Everything else is secondary.

Dark Current: Are such people easy to find?
Liang Wenfeng: Their passion often makes them stand out because they genuinely want to do this. These people are usually looking for us, too.

Dark Current: Building large models demands unending investment. Does the cost make you hesitant?
Liang Wenfeng: Innovation is expensive and inefficient, often involving waste. That’s why it only becomes viable when economies reach a certain level of development. OpenAI burned through significant resources to achieve its breakthroughs.

Dark Current: Do you ever feel like you’re doing something crazy?
Liang Wenfeng: I’m not sure if it’s crazy, but many things in this world defy logical explanation. For example, many programmers tirelessly contribute to open-source projects out of sheer enthusiasm.

Dark Current: Is there a sense of spiritual reward in that?
Liang Wenfeng: It’s like hiking 50 kilometers—your body is exhausted, but your spirit feels fulfilled.

Dark Current: Can curiosity-driven passion last indefinitely?
Liang Wenfeng: Not everyone can maintain such intensity throughout their lives, but most people can devote themselves fully to something during their younger years without being overly driven by material goals.

End of Article.

Original: https://finance.sina.com.cn/tech/2025-01-26/doc-inehhksk9178057.shtml