Ai Anli (爱案例) · Global AI Litigation Bellwether

Understanding the NOTICE File for Open-Source Large Models: Functions and Key Points

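I. Origins and Function of the NOTICE File

A NOTICE file differs from a LICENSE file. The LICENSE contains the terms of the grant and tells users explicitly what they may and may not do. The NOTICE file is a statement of copyright and attribution: it is not the grant itself but a supplementary note on copyright and provenance. In other words, the LICENSE tells you what the rules are, and the NOTICE tells you who made the work and which sources it draws on.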

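II. Practical Role

NOTICE files were first used in open-source software such as Java projects and Apache HTTP Server. As AI and large models developed, NOTICE files were adopted for model releases in order to:

State the sources of training data

State dependencies and their licenses

Explain the derivation relationships of model weights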

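Model cards and repositories on open-source platforms such as Hugging Face and PyTorch Hub are expected to carry this kind of license and attribution information, and a NOTICE file is a common way to provide it. The NOTICE file has become an important tool for compliance, traceability, and transparent accountability in open-source AI models.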

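Where the NOTICE file sits in the ecosystem (logical structure)

                ┌──────────────────────────┐
                │ Open-source large model  │
                │ (source code + weights)  │
                └──────────────────────────┘
                              │
                              │ distribution
                              ▼
┌────────────────────────────────────────────────────────────┐
│            Contents of the distribution package            │
└────────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│     LICENSE      │ │      NOTICE      │ │     README /     │
│ (license terms)  │ │ (copyright info) │ │    model card    │
└──────────────────┘ └──────────────────┘ └──────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
How it may be used:  Original authors and Purpose, risks, and
permissions, limits  copyright ownership  usage instructions
         │                    │
         └─────────┬──────────┘
                   ▼
Third-party dependencies & data sources
┌──────────────────────────────────────┐
│ Third-party libraries, open models,  │
│ data (copyright / license / source)  │
└──────────────────────────────────────┘
                   │
                   ▼
References in the NOTICE file
┌──────────────────────────────────────┐
│ Names and licenses of third-party    │
│ dependencies                         │
└──────────────────────────────────────┘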

III. Structure

The copyright and license type of any third-party open-source component or dataset should also be stated:

This model uses data from XYZ Dataset (CC-BY-4.0, 2019).

License information

State the open-source license the model is released under (for example Apache 2.0 or MIT):

Licensed under the Apache License, Version 2.0

https://www.apache.org/licenses/LICENSE-2.0

Attribution

If the model uses third-party libraries or open-source code, state their source and purpose, for example:

This model leverages the Transformers library from Hugging Face (Apache 2.0)

and datasets from Common Crawl (Common Crawl Terms of Use).

Derivation (optional)

If the model was trained from another open-source model, the derivation can be noted:

Derived from GPT-NeoX (Apache 2.0), modified for domain-specific tasks.

Disclaimer (optional but recommended)

State the model's risks and usage restrictions, which limits the liability of the original authors and publishers:

This model is provided “as is” without warranties. Users are responsible for compliance with applicable laws.
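The sections listed above (license, attribution, derivation, disclaimer) can be assembled mechanically from structured data. The following is a minimal sketch, not a prescribed format: the `Component` and `NoticeInfo` classes, their field names, and the `render_notice` function are all hypothetical names introduced for illustration, and the output wording is only one possible layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """A third-party library, dataset, or base model (illustrative structure)."""
    name: str
    license: str
    note: str = ""  # e.g. a short description of how the component was modified


@dataclass
class NoticeInfo:
    """Inputs for the sections described above; every field name is hypothetical."""
    license_name: str
    license_url: str
    attributions: List[Component] = field(default_factory=list)
    derived_from: Optional[Component] = None  # optional section
    disclaimer: str = ""                      # optional but recommended


def _describe(c: Component) -> str:
    """Format a component as 'Name (License)' plus an optional trailing note."""
    return f"{c.name} ({c.license})" + (f", {c.note}" if c.note else "")


def render_notice(info: NoticeInfo) -> str:
    """Assemble the sections in order, skipping optional ones that are empty."""
    lines = [f"Licensed under the {info.license_name}", info.license_url, ""]
    if info.attributions:
        lines.append("This model uses the following third-party components:")
        lines.extend(f"  - {_describe(c)}" for c in info.attributions)
        lines.append("")
    if info.derived_from is not None:
        lines.extend([f"Derived from {_describe(info.derived_from)}.", ""])
    if info.disclaimer:
        lines.extend([info.disclaimer, ""])
    return "\n".join(lines)


if __name__ == "__main__":
    info = NoticeInfo(
        license_name="Apache License, Version 2.0",
        license_url="https://www.apache.org/licenses/LICENSE-2.0",
        attributions=[Component("Transformers", "Apache 2.0")],
        derived_from=Component(
            "GPT-NeoX", "Apache 2.0", "modified for domain-specific tasks"
        ),
        disclaimer='This model is provided "as is" without warranties.',
    )
    print(render_notice(info))
```

Keeping the inputs as data rather than hand-edited text makes it easier to regenerate the file whenever a component is added or removed.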

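IV. Practical Points for Drafting a NOTICE File

Completeness

Cover every third-party component and data source without omission. For copyright-sensitive data in particular (such as text, images, and audio), the source and license must be stated.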
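A completeness check of this kind can be partly automated. The sketch below assumes the project keeps its own list of component names and simply looks for each name in the NOTICE text; the function name `missing_from_notice` is hypothetical, and a plain substring match is a deliberate simplification that will not catch renamed or abbreviated entries.

```python
from typing import List


def missing_from_notice(notice_text: str, components: List[str]) -> List[str]:
    """Return the component names that do not appear anywhere in the NOTICE text.

    Matching is a case-insensitive substring test, which is only a rough proxy
    for "this component is properly attributed".
    """
    haystack = notice_text.lower()
    return [name for name in components if name.lower() not in haystack]


if __name__ == "__main__":
    notice = (
        "Transformers library by Hugging Face (Apache 2.0)\n"
        "SentencePiece tokenizer (Apache 2.0)\n"
    )
    used = ["Transformers", "SentencePiece", "PyTorch", "XYZ Dataset"]
    print(missing_from_notice(notice, used))  # ['PyTorch', 'XYZ Dataset']
```

A check like this only flags names that are absent; whether the license stated for each entry is correct still needs human review.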

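Clear and readable

Use plain, simple text wherever possible and avoid piling up legal jargon. List the copyright information for each dataset or component as a separate item.

Consistency with the LICENSE file

The NOTICE file is only a supplement; the LICENSE file remains the governing agreement.

The two must agree on the license type and copyright ownership to avoid confusion.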
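One way to guard against a mismatch between LICENSE and NOTICE is to detect which license each file names and compare the results. This is a rough sketch under stated assumptions: the two regular expressions below are illustrative and cover only Apache 2.0 and MIT, the names `LICENSE_PATTERNS`, `detect_licenses`, and `licenses_consistent` are hypothetical, and a real project would more likely compare SPDX identifiers.

```python
import re
from typing import Set

# Illustrative patterns only; they recognise two licenses and nothing else.
LICENSE_PATTERNS = {
    "Apache-2.0": re.compile(r"Apache License,?\s+(Version\s+)?2\.0", re.IGNORECASE),
    "MIT": re.compile(r"\bMIT License\b", re.IGNORECASE),
}


def detect_licenses(text: str) -> Set[str]:
    """Return the identifiers of all known licenses named in the text."""
    return {spdx for spdx, pattern in LICENSE_PATTERNS.items() if pattern.search(text)}


def licenses_consistent(license_text: str, notice_text: str) -> bool:
    """True if LICENSE names at least one known license and NOTICE names it too.

    NOTICE may name additional licenses (for third-party components), so the
    check is a subset test rather than strict equality.
    """
    declared = detect_licenses(license_text)
    return bool(declared) and declared <= detect_licenses(notice_text)


if __name__ == "__main__":
    license_text = "Apache License\nVersion 2.0, January 2004\n"
    notice_text = "Licensed under the Apache License, Version 2.0\n"
    print(licenses_consistent(license_text, notice_text))
```

The function returns `False` when the LICENSE text matches none of the known patterns, so an unrecognised license is treated as something to inspect by hand rather than silently passed.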

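Timely updates

When the model is iterated or new datasets are added, update the NOTICE file at the same time. This matters most when new modules or libraries contributed by the open-source community are added: leaving the file stale creates copyright risk.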

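Distribution

The NOTICE file should be distributed together with the model weights, source repository, and model card so that downstream users can obtain it.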
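Before publishing, it can help to verify that every location being distributed actually contains the file. The sketch below assumes each distribution location is a local directory and that the files are named exactly `LICENSE` and `NOTICE`; the function name `locations_missing_files` and the directory names in the usage example are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Sequence

# Assumed file names; some projects use NOTICE.txt or LICENSE.md instead.
REQUIRED_FILES = ("LICENSE", "NOTICE")


def locations_missing_files(
    locations: Iterable[Path], required: Sequence[str] = REQUIRED_FILES
) -> Dict[str, List[str]]:
    """Map each distribution directory to the required files it lacks.

    Directories that contain every required file are left out of the result,
    so an empty dict means all locations passed.
    """
    report: Dict[str, List[str]] = {}
    for location in locations:
        missing = [name for name in required if not (Path(location) / name).is_file()]
        if missing:
            report[str(location)] = missing
    return report


if __name__ == "__main__":
    # Hypothetical layout: a source repository and a separate weights folder.
    for location, missing in locations_missing_files(
        [Path("repo"), Path("weights")]
    ).items():
        print(f"{location}: missing {', '.join(missing)}")
```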

V. Sample NOTICE File (Open-Source Large Model)

This model and its associated code are licensed under the Apache License, Version 2.0

https://www.apache.org/licenses/LICENSE-2.0

--------------------------------------------------------------------------------

1. Model Description

--------------------------------------------------------------------------------

Model Name: [Model Name]

Version: [v1.0 / vX.X]

Description: [Brief description of the model, e.g., "A large language model for domain-specific text generation."]

--------------------------------------------------------------------------------

2. Source Code

--------------------------------------------------------------------------------

Source code repository: [GitHub / Hugging Face repo link]

License: Apache License 2.0

Additional notices:

[List any third-party libraries used in the code, e.g., Transformers library, PyTorch, etc.]

Example:

Transformers library by Hugging Face (Apache 2.0)

SentencePiece tokenizer (Apache 2.0)

Include copyright and license info from third-party libraries if required.

--------------------------------------------------------------------------------

3. Model Weights

--------------------------------------------------------------------------------

Model weights file(s): [link or file path]

License: Apache License 2.0

Attribution: [Original model authors if weights are derived from other models]

Example:

Derived from GPT-NeoX (Apache 2.0), modified for domain-specific tasks

--------------------------------------------------------------------------------

4. Training Data

--------------------------------------------------------------------------------

Data sources:

[Dataset Name] ([License], Year)

[Dataset Name] ([License], Year)

Notes:

Only legally available or authorized datasets were used.

Personal or sensitive information has been removed or anonymized.

Attribution example:

Common Crawl (Common Crawl Terms of Use)

OpenWebText Corpus (CC-BY-SA 4.0)

--------------------------------------------------------------------------------

5. Third-Party Dependencies

--------------------------------------------------------------------------------

[Library / Tool Name] ([License])

Example:

Hugging Face Datasets (Apache 2.0)

PyTorch (BSD)

SentencePiece (Apache 2.0)

--------------------------------------------------------------------------------

6. Disclaimers

--------------------------------------------------------------------------------

This model is provided "as is" without warranties of any kind.

The authors and distributors are not responsible for any misuse, including generation of unlawful, harmful, or offensive content.

Users are responsible for compliance with applicable laws and ethical guidelines.

--------------------------------------------------------------------------------

7. Contact Information

--------------------------------------------------------------------------------

Maintainer / Author: [Name, Organization]

Email / Website: [Contact Info]

GitHub / Hugging Face Repository: [Link]

================================================================================

END OF NOTICE FILE

================================================================================

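Usage recommendations

Replace all placeholders: model name, authors, datasets, third-party libraries, and license information.

Keep it current: when the model, datasets, or dependencies change, update the NOTICE file in step.

Distribute it with the model: provide it in the GitHub repository, the model weights folder, and the Hugging Face model card.

Keep LICENSE and NOTICE consistent: make sure the license type matches exactly what the NOTICE file declares.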
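The first recommendation, replacing every placeholder, lends itself to a simple automated check. The sketch below assumes placeholders follow the square-bracket style used in the sample file (for example `[Model Name]`); the function name `unreplaced_placeholders` is hypothetical, and any legitimate bracketed text in a NOTICE file would also be flagged.

```python
import re
from typing import List, Tuple

# Matches bracketed spans on a single line, such as "[Model Name]".
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\]")


def unreplaced_placeholders(notice_text: str) -> List[Tuple[int, str]]:
    """Return (line number, placeholder) pairs for every bracketed span found.

    Line numbers are 1-based so they match what a text editor shows.
    """
    found: List[Tuple[int, str]] = []
    for lineno, line in enumerate(notice_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((lineno, match.group(0)))
    return found


if __name__ == "__main__":
    draft = (
        "Model Name: [Model Name]\n"
        "Version: v1.0\n"
        "Maintainer / Author: [Name, Organization]\n"
    )
    for lineno, placeholder in unreplaced_placeholders(draft):
        print(f"line {lineno}: {placeholder}")
```

Running a check like this in a release script gives an early warning when a template has been copied but not fully filled in.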
