As AI advances, high-fidelity training data has become critical to improving model performance. However, with real-world data becoming increasingly scarce, the synthetic data sector has gained significant traction. DataArc has emerged as a breakout player in the synthetic data space, securing multiple rounds of funding in its first year, with total funding approaching eight figures.

Since its founding in 2025, DataArc’s rapid ascent has captured the attention of the global investment community. The company has garnered backing from prominent venture capital firms, including Innoangel Fund and Oriental Fortune Capital—a testament to the strong institutional confidence in DataArc’s specialized approach to the synthetic data market.

A Strong Partnership: Incubated by IDEA, Built on a Top Technical Team

DataArc’s rapid growth is closely tied to its strong core team and incubation support. CEO Jiang Xuhui and CTO Xu Chengjin previously worked as research scientists at the IDEA Research Institute. During their time there, they built a strong research track record and, with the institute's full backing, formally launched their entrepreneurial journey.

Jiang Xuhui, CEO (second from right), at GITEX-ENS 2025 in Dubai. (Photo: DataArc)

The IDEA Research Institute was founded by renowned scholar Harry Shum, former Microsoft Executive Vice President. It is an international, cutting-edge research organization focused on the digital economy, with a mission to incubate innovative companies that combine world-class R&D with proven commercialization capability. DataArc is a prime example of this mission.

Jiang and Xu represent a new generation of AI researchers translating academic breakthroughs into commercial solutions. According to their Google Scholar profiles, the DataArc team has published more than 100 papers in leading venues such as ICLR, EMNLP, and AAAI. The team brings together academic contributions from top institutions in China and abroad, and combines them with hands-on experience from leading internet companies, AI unicorns, and top investment firms—supporting both world-class R&D and strong commercialization execution.

Breaking Through: Targeting an Open Market with an End-to-End Synthetic Data Offering

DataArc stands out not only for its team, but also for its clear reading of industry trends and disciplined execution. Projections from organizations such as Epoch AI suggest that usable human-generated text data on the internet could be largely exhausted by 2028, making data scarcity an increasingly serious constraint on further model gains. In this context, synthetic data is emerging as a mission-critical strategy to address both data availability and compliance requirements, and the sector is entering a period of rapid growth.

To meet rising demand, DataArc has focused on synthetic data and moved quickly toward commercialization. It has launched SynData Platform, a tailored one-stop commercial platform for synthetic data generation, offering customized solutions for enterprise users across a range of industry verticals. DataArc has also open-sourced the DataArc-SynData Toolkit (SDT) framework, available for free on GitHub, helping lower adoption barriers and broaden access to synthetic data capabilities.

Commercialization: Recognized Across Sectors, Advancing AI Upgrades with Synthetic Data

Compared with traditional real-world data, synthetic data offers greater controllability, stronger compliance readiness, and broader flexibility. It is well suited for use cases where data is limited, demand is high, and regulatory requirements are strict.

Leveraging its enterprise-grade solutions, DataArc has secured significant commercial mandates from a diverse portfolio of global industry leaders across finance, high-end manufacturing, and cloud infrastructure. Currently, the company is steadily advancing its commercial deployment in the Middle East and other key international markets, establishing strategic partnerships with national-level digital government agencies, leading cloud service providers, and major telecom operators. This expansion serves as a critical milestone in DataArc’s global market validation.

The latest funding round will be used to increase R&D investment, expand global market reach, and build a more complete product ecosystem. DataArc said it will keep synthetic data at the center of its strategy, deepen its work in vertical industry scenarios, and continue upgrading its technology and products. The company aims to help enterprises improve model performance and accelerate adoption of AI in real-world business applications, contributing to higher-quality growth in the digital economy.

GitHub: https://github.com/DataArcTech

Media Contact

Company Name: DataArc

Contact Person: Christina

Country: China