
As AI activities such as scraping and crawling become pervasive, publishing companies in Asia are fighting to protect their valuable IP and their futures. Brian Yap and Byung Jin Park report

From ChatGPT and Microsoft Copilot to Google Gemini, artificial intelligence (AI) technologies have become part of the daily lives of many people around the world, not least in Asia. For businesses that rely on the creation and distribution of often specialist information for revenue, AI’s ability to scrape these gems and reproduce them in altered, often enhanced, formats is evolutionary. Why access these websites when AI can provide a faster, more complete product?

The conundrum presents a fight-or-flight moment or, put more succinctly, a choice: embrace or repel a technology that may spell your doom.

Nick Ellison, a London-based managing director in the advanced analytics business at New York-headquartered global risk and financial advisory company Kroll, observes how AI technologies have introduced both extraordinary potential and new categories of risk.

“In recent years, some of the most pressing concerns have centred around intellectual property misuse, the loss of attribution and context for original content, and reputational distortion arising from inaccurate or misleading AI outputs,” says Ellison.

Specialising in software and technology IP matters including the AI misuse of IP, data privacy and group litigation, he adds that these risks affect creators, publishers and AI developers alike, blurring longstanding boundaries between ownership, authorship and accountability.

It is not surprising that companies in the media, publishing and content creation industries have been most vocal about the impact of AI on their business operations and revenue.

“Based on my observation, most complaints about these issues (copyright infringement and so on) in Japan have come from media companies,” says Shinnosuke Fukuoka, a Tokyo-based partner at Nishimura & Asahi who handles AI, big data and Internet of Things (IoT) matters in robotics and AI.

A quick search through related news in the past year – news reported and written by actual journalists, not AI – shows that complaints about these issues have also been raised in several other countries across Asia.


Some of these complaints have reached the courts.

In August 2025, Japan’s major newspapers Nikkei, Asahi Shimbun and Yomiuri Shimbun filed legal proceedings against US-based AI search engine Perplexity AI in the Tokyo District Court for copyright infringement and unfair competition.

In South Korea, three major broadcasters – KBS, MBC and SBS – filed a lawsuit in January 2025 against domestic internet portal operator Naver in the Seoul Central District Court, seeking to stop alleged copyright infringement and unfair competition related to the use of their content for generative AI training, as well as to claim damages.

These cases followed Indian news agency Asian News International’s (ANI) lawsuit in November 2024 against US-based AI research and deployment company OpenAI, the creator of ChatGPT, in the Delhi High Court for copyright infringement, false attribution and improper use of its content.

All three cases are ongoing, but proving copyright infringement in court is difficult. “From the perspective of copyright holders, it is nearly impossible to demonstrate how AI companies, using advanced technology, accessed their data, copied it and incorporated it into AI training,” says Min Seon Shim, an IP partner at Barun Law in Seoul.


Shim notes that only by proving this process can one claim that their works were used and demand compensation, which is beyond the capability of most ordinary companies without AI expertise.

This is even more challenging for small and medium-sized media companies, often specialist or niche publications, which generally face resource constraints compared to larger peers.

“From experience advising SME publishers, common issues include [lacking] the legal and technical capacity to monitor AI usage or pursue litigation … [and] difficulty in proving how data was ingested and used,” says Nishimura & Asahi’s Fukuoka.

Fukuoka also points out that many SMEs underestimate the risk of AI scraping, or assume existing copyright laws offer strong protection, while fearing loss of relevance if access is restricted too aggressively.

He adds that proving AI copyright infringement in Japan is difficult because AI developers rarely disclose their training data sources and because Japan, unlike the US, lacks broad pre-trial discovery, which limits access to internal AI training records.

“Even if outputs resemble original works, proving that similarity results from reliance (not coincidence) is hard … This evidentiary burden means SMEs often cannot meet the threshold for infringement claims,” says Fukuoka, who acts as a committee member of the Ministry of Economy, Trade and Industry Investigative Commission on the AI and Data Contract Guidelines Review Committee.

Defence of rights

This raises the question of whether media companies – big or small – can defend their business and revenue against pervasive AI activities such as data scraping.

The Organisation for Economic Co-operation and Development (OECD) AI Policy Observatory is an online platform launched by the OECD in 2020 to promote trustworthy, human-centric AI. In its 24 March 2024 article titled “The AI data scraping challenge: How can we proceed responsibly?”, the observatory defined data scraping as “using web crawlers or other means to obtain data from third-party websites or social media properties”. It added that “today’s large language models (LLMs) depend on vast amounts of scraped data for training, and potentially other purposes.”

The Australian Financial Review, owned by media conglomerate Nine Entertainment, reported in August 2025 that Nine had been tracking bot and crawler activity on its websites. In June 2025, Nine’s websites were reportedly crawled almost 10 times every second by AI firms combined, with OpenAI contributing the most.

The good news is that there are technical and legal measures that news organisations can take to protect their copyrighted content. But here is the catch – in Asia, any course of action will vary depending on the jurisdiction.

Kroll’s Ellison says that regional responses to AI differ significantly.

In the US, the approach is legalistic and assertive, with rights holders quick to challenge unlicensed use through the courts. The UK focuses on preventative measures, using policy and technical controls to mitigate risk. Europe prefers a regulation-first approach, embedding rules directly into law, rather than relying solely on judicial outcomes.

In Asia, the picture is more cautious and fragmented. “Policymakers are generally taking incremental, consultative steps, balancing innovation with protection and avoiding sweeping regulation,” says Ellison. “The result is a patchwork of national frameworks, each evolving at a different pace.”

He says that this makes it harder for publishers operating across multiple jurisdictions, as they must navigate diverse standards, varying enforcement timelines and inconsistent licensing norms.

Singapore stands out as a unique market for AI in Asia. It has had a fair use exception similar to that of the US since 2004, under the US-Singapore Free Trade Agreement.

Under the Copyright Act 2021 of Singapore, whether any use constitutes fair use is to be determined based on four non-exhaustive factors, including the purpose and character of the use, such as whether it is of a commercial nature or is for non-profit educational purposes.

Pin-Ping Oh, an IP partner in the Singapore office of Bird & Bird, says: “The fair use exception can potentially apply to allow AI training without a licence, taking a leaf from what the US courts have done. However, our courts have not had the opportunity to rule on this point.”

In 2021, Singapore introduced a so-called text and data mining (TDM) exception, also known as the computational data analysis (CDA) exception, under the Copyright Act 2021. CDA is defined as including “using a computer program to identify, extract and analyse information or data from a work or recording”, and “using the work or recording as an example of a type of information or data to improve the functioning of a computer program in relation to that type of information or data”.

Oh notes that some copyright owners are concerned that Singapore’s TDM exception is too broad and does not adequately protect their rights.

“Singapore’s TDM is broader than the equivalent exceptions in the UK and the EU because it applies to both non-commercial and commercial purposes, with no purpose limitation such as for research only,” she says. “Furthermore, there are restrictions on the right to opt out of the application of the TDM exception. Any term in a Singapore law-governed contract that purports to exclude or restrict the operation of the TDM exception is void. This also applies to foreign law contracts in certain circumstances.”

The main safeguard under the TDM exception is the requirement of “lawful access” to content. This term appears in the Singapore Copyright Act but is not defined, creating uncertainty about its precise meaning. The act only provides examples: for instance, access by circumventing paywalls would not be considered “lawful access”.

“This is a very important safeguard because it allows for copyright owners to retain some level of control over their material, and to prevent it from falling [under the TDM exception],” says Oh.

Besides Singapore, Japan is currently the only other jurisdiction in Asia with a TDM exception under its copyright law, but it does not have a fair use clause. Japan amended its Copyright Act in 2018 to include article 30-4, allowing businesses to use copyrighted material for AI development and other information analysis purposes.

Under this provision, companies can freely use copyrighted material for information analysis, which includes AI development and the use of AI technology. This means that companies can collect material from the internet or other sources and feed it into AI systems for development.

Japan also has no specific AI law that imposes strict obligations on companies developing or using AI. The only existing AI-related act mainly places obligations on the government to study AI technology and establish an appropriate legal framework in the future.

Although the Copyright Act allows the use of copyrighted material for AI development, it also states that businesses must not unreasonably harm the interests of copyright owners. Collecting and using data in itself is not considered unreasonable harm, says Daisuke Tatsuno, a partner in the IP tech group at Baker McKenzie’s Tokyo office.

“However, determining when an AI developer has unreasonably harmed a copyright owner’s interests remains unclear,” says Tatsuno, who focuses his practice on the registration, protection, dispute and licensing of IP rights.

For media companies seeking to protect their editorial content in Japan, Kensaku Takase, a partner and IP tech practice group head at Baker McKenzie in Tokyo, says that one possible action is to include a clear statement prohibiting data collection for AI development purposes, such as implementing a “robots.txt” file on the server.
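As a rough illustration of the robots.txt approach Takase describes, the sketch below shows what such a file might look like and how a compliant crawler would read it, using Python’s standard `urllib.robotparser` module. The crawler names `GPTBot` and `PerplexityBot`, the placeholder `ExampleSearchBot`, the domain `news.example.com` and the helper `is_allowed` are assumptions chosen for illustration, not details taken from any publisher named in this article. A robots.txt file is only a request that well-behaved crawlers may honour; it does not technically block access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher. The AI crawler names
# listed here are illustrative assumptions; a real file would list whichever
# user agents the publisher wants to exclude.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article_url = "https://news.example.com/articles/1"
    for agent in ("GPTBot", "PerplexityBot", "ExampleSearchBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, article_url) else "disallowed"
        print(f"{agent}: {verdict}")
```

In this sketch the two named AI crawlers are told to stay away from the whole site while every other user agent remains welcome, which reflects the trade-off publishers describe between protecting content and staying visible.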

However, Takase, who also serves as Baker McKenzie’s global IP head, notes that this alone may not suffice because the act allows data collection and use unless it “unreasonably harms the copyright owner’s interests”. Implementing access barriers can strengthen the argument that data collection and use reach this level of harm, making the exceptions inapplicable.

Takase also suggests placing content behind a paywall, or protecting it with user ID and password authentication, which further supports claims that unauthorised data collection causes unreasonable harm.
