China's Cyberspace Administration Proposes Draft Rules to Regulate Generative AI
Regulators around the globe are paying close attention to the explosive growth of generative artificial intelligence ("AI") technology, such as ChatGPT. Indeed, as noted in our earlier blog, Italy's national privacy regulator recently lifted its ban on ChatGPT after OpenAI agreed, in response to the Italian authority's demands, to modify how it collects and processes data and to implement protocols to enhance data security. In addition, other national authorities have inquired into OpenAI's data processing practices and compliance with Europe's General Data Protection Regulation.
On April 11, 2023, China's leading technology regulatory authority, the Cyberspace Administration of China ("CAC"), announced that it is looking to preemptively regulate the use of generative AI and published a proposed regulation: "Measures for the Administration of Generative Artificial Intelligence Services" (the "Proposed Rules"). The Proposed Rules were out for public comment through May 10 and are expected to go into effect sometime before the end of 2023. This article highlights a number of key areas for developers and users of generative AI technology (both individual consumers and businesses) to consider if they are located in mainland China.
Definitions and Jurisdictional Scope
The Proposed Rules define generative AI as "the technology of using algorithms, models and rules to generate texts, pictures, sounds, videos, codes and other contents." In contrast to the "deep synthesis" technology regulated under the Administrative Provisions on Deep Synthesis in Internet-based Information Services (the "Administrative Provisions on DSIS," another algorithm-related regulation promulgated by CAC in November 2022), the Proposed Rules appear to reach beyond algorithms to include "models and rules" used to generate content. This breadth has potential implications for systems and processes that are not generally understood to fall within the scope of traditional generative AI systems.
Article 2 of the Proposed Rules limits their jurisdictional scope to entities that "provide services to the public within the territory of the People's Republic of China." This approach suggests that no matter where companies providing AI-generated content ("AIGC") are physically located, the Proposed Rules will apply so long as those entities "provide services" to the PRC's domestic users or allow such users to access their AIGC services. This may be why ChatGPT, seeking to stay outside the reach of China's laws, does not currently allow users from "the territory of PRC" to register for an account or use its services.
Further, the Proposed Rules may be construed to regulate not only AIGC companies that directly provide services to the PRC's domestic users but also companies that indirectly provide services to such users. Consider, for example, a U.S.-designed AIGC product that is sold not to "the public" but to a specific enterprise customer in China, and that is customized or trained for a use case unique to that customer. If the enterprise customer then in turn offers that AIGC product directly to its own customers, or as part of an integrated product or service offering, the AIGC product could fall within the scope of the Proposed Rules.
Compliance Requirements
One of the provisions of the Proposed Rules requires that "any content generated by using generative AI … must not contain … false information…". This raises a significant compliance challenge. The first question is: what is "false information"? Much AIGC, such as AI-generated articles, pictures and videos, is neither "true" nor "false" but instead reflects the product of user input, the data on which a model was trained, and the model parameters. If all AIGC is held to such an imprecise standard, the limitation may undermine a key objective of generative AI technology: creativity and originality. Whether CAC will further explain what "false information" means remains to be seen, but one would hope that the principle will be narrowed to information that is demonstrably false and intended to deceive.
In addition, the Proposed Rules would impose significant compliance obligations on providers of generative AI products or services. For example, Article 4 requires that generative AI providers ("providers") take measures to "prevent generating false information," "prevent discrimination based upon race, gender, ethnicity" (and similar factors), and respect intellectual property rights and business ethics. Further, the Proposed Rules prohibit use of these systems to "engage in unfair competition."
Article 5 states that providers (either organizations or individuals) "using generative AI product to provide services such as chat and text, image and sound generation, including supporting others to generate texts, images and sound on their own by providing access to such capabilities via programmable interface (API) or other means, shall assume the responsibilities of the producer of such AIGC…" Under a plain reading of this language, someone who merely provides API services may be deemed a provider under the Proposed Rules. If so, an API service provider would be constructively providing generative AI services and could be deemed to be indirectly "provid[ing] services to the public within the territory of the People's Republic of China" under Article 2. This could greatly expand the scope of applicability of the Proposed Rules.
One of many questions raised by the Proposed Rules is whether AIGC providers, as deemed content producers, may also be held responsible when AIGC infringes others' lawful intellectual property rights. The Proposed Rules do not clearly address this issue, and CAC should provide more clarity on this subject when it adopts final rules.
Security Assessment and Recordation
Pursuant to Article 6 of the Proposed Rules, before providing generative AI products or services to the public, providers must submit a security assessment to the State cyberspace department in accordance with the Regulations on the Security Assessment of Internet Information Services with Public Opinion Attributes or Social Mobilization Capabilities, and must complete the procedures for record-filing, change, and cancellation of recordation of algorithms in accordance with the Provisions on Administration of Algorithm-based Recommendations in Internet Information Services. Completing the security assessment and recording algorithms will thus become threshold requirements once the Proposed Rules become effective later in the year.
Suppose, however, that a provider offering AIGC products or services merely via API has completed the security assessment, recorded its algorithms with the relevant governmental agency, and subsequently provides the technology to a customer. A question remains whether that customer is still required to go through the security assessment and applicable recording procedure if it packages the technology into a final product offered to end users. This is a question CAC will hopefully answer in the near future.
Training Data Duties and Responsibilities
One of the compliance risks of algorithm training is the nature of the source data. Article 7 of the Proposed Rules holds providers "responsible for legitimacy of the sources of pre-training data and optimized training data for generative AI products." This suggests that CAC may take a more proactive approach to monitoring the legitimacy and compliance of training data sources. Like Article 4, this article emphasizes "the authenticity, accuracy [and] objectivity" of the data. In practice, however, algorithms are often trained on synthesized or randomly generated data. It is not clear whether such training scenarios violate the Proposed Rules, unless the requirement for data authenticity and accuracy applies only to collected data and not to synthesized or randomly generated data. The final rules will need to further specify these requirements.
Article 8 states that "where manual labeling is used in the development of generative AI products," providers shall set up a "clear, specific and operable labeling standard" to ensure the accuracy of trained models at the development stage.
Because protection of input information and usage history, as well as restrictions on user profiling and on providing input information to third parties, already fall within the purview of legislation such as the PRC's Personal Information Protection Law and Electronic Commerce Law, Article 11 does not impose additional compliance responsibilities on providers as personal information processors or service providers. Rather, under the Personal Information Protection Law, individual consent is a legitimate basis for a provider to lawfully profile users based on their input information or to provide a user's input information to a third party, which can operate as an exception to Article 11 of the Proposed Rules.
Article 13 imposes on providers a duty to establish a mechanism for addressing consumer complaints, including "correction, deletion and blocking of" consumers' personal information. This provision further emphasizes that providers bear the "responsibilities of content producers." Under this article, "discovering and knowing" may imply a "should know" standard. Providers subject to Article 13 should therefore closely scrutinize AI-generated content at the content generation stage, within the reasonable scope of monitoring and investigation achievable with existing technology.
Building on Article 4's requirements for the authenticity and accuracy of AIGC, Article 15 requires providers to retrain algorithms or modify model parameters to prevent repeated output that violates the requirements of the Proposed Rules. However, by making a whistleblower report a sufficient ground to trigger the retraining or modification obligation, Article 15 could impose a significant economic burden on providers.
Further, the Administrative Provisions on DSIS prescribe two types of marking: (i) marks or labels that do not affect users' use, usually referred to as traceable marks, such as watermarks; and (ii) conspicuous marks, which apply only where generated content may cause confusion or misidentification by the public, such as AI dialogue, AI writing, synthetic human voices, human face generation, face replacement, face manipulation, gesture manipulation, and immersive simulation scenes. The identification obligation stipulated in Article 16 likely refers to the latter, conspicuous marks. The provision also expands the conspicuous marking obligation of the Administrative Provisions on DSIS: AIGC products or services must be conspicuously identified regardless of whether they may cause public confusion.
In addition, Article 19 obligates providers to actively monitor users who violate laws, regulations, business ethics, or social morality when using generative AI products or services, and to suspend or terminate services to those users. Providers should, however, expect operational challenges, such as how to accurately identify unlawful or unethical user behavior while still protecting user privacy.
If providers violate the Proposed Rules, Article 20 allows for possible criminal liability in accordance with the PRC's Cyber Security Law, Data Security Law, Personal Information Protection Law, or other applicable laws or regulations. Where applicable laws or administrative regulations contain no relevant provisions, this article grants the State's cybersecurity department or other relevant competent departments discretion to "give warnings, circulate criticism, and order corrections." Providers that fail to comply may have their services suspended or their use of generative artificial intelligence to provide services terminated, and may be fined or criminally prosecuted.
Takeaways
The proposed "Measures for the Administration of Generative Artificial Intelligence Services" clearly reflect the intention of China's regulator to provide directional guidance on the development of algorithm-focused industries such as generative AI and AIGC. However, many articles of the Proposed Rules are ambiguous, and the industry would benefit from further clarity from the regulator. In the meantime, the Proposed Rules will impose substantial compliance obligations on providers of generative AI products or services in China.