
2004 情財第 845 号

Part 4: Standardization Trends in the IETF

March 2005

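Registered trademarks

• Microsoft, MS, Windows, Windows 2000, Windows NT, Windows XP, the Windows logo, Internet Explorer, Outlook, Outlook Express, and other such names are registered trademarks or trademarks of Microsoft Corporation in the United States and other countries.

• Sun Microsystems, the Sun logo, the Java coffee cup logo, Solaris, Java, JDK, and other such names are registered trademarks or trademarks of Sun Microsystems in the United States and other countries.

• Other company names, product names, and similar names appearing in this document are generally trademarks or registered trademarks of their respective companies.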

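3.3.1 Nameprep
3.4 Standardization in the Security Area
3.4.1 Standardization in the SSH WG
3.4.2 Standardization in the SASL WG
3.4.3 SASLprep
3.5 Standardization in the PKIX WG
3.5.1 Internationalization of certificates in RFC 2459/3280
3.5.2 Internationalization of certificates in I-D 3280bis
4 Summary
References

List of Figures and Tables

Figure 1 Relationship among ISOC, IETF, IESG, area directors, and WGs
Figure 2 Concept of internationalization in the IETF
Figure 3 I-Ds and RFCs related to StringPrep
Figure 4 Relationship among the certificate standards
Table 1 Types of internationalized strings that DirectoryString can represent
Table 2 Differences in prohibited characters between RFC 3454 and the I-D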

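1 Introduction

This report summarizes the state of standardization in the IETF concerning multi-byte character sets in PKI public key certificates.

1.1 What is the IETF?

The IETF (note 1) is a subordinate organization of the Internet Society (note 2) and standardizes the protocols used on the Internet.

The IETF divides its work into seven areas. Each area has area directors, and standardization work proceeds under their guidance. Each area has several WGs beneath it, which carry out the work of that area. The area directors and the IETF/IAB (note 3) form the IESG (note 4), which steers the technical direction of the Internet as a whole.

Figure 1 shows the relationship among ISOC, IETF, IESG, area directors, and WGs.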

[Figure 1: organization chart. ISOC accredits the members and provides financial backing. The IESG has the ADs and the IAB chair as its members. Within the IETF, the seven areas (Application, General, Internet, Operations and Management, Routing, Security, Transport) each have ADs, and each area contains several WGs, each with a chair.]

Figure 1 Relationship among ISOC, IETF, IESG, area directors, and WGs

Note 1: Internet Engineering Task Force, http://www.ietf.org/
Note 2: ISOC, Internet Society, http://www.isoc.org/
Note 3: Internet Architecture Board, http://www.iab.org/iab/

2 Standardization of multi-byte character sets in certificates

Certificates are based on ISO/IEC 9594-8 / ITU-T X.509 [1], the authentication framework of the ISO/IEC 9594 / ITU-T X.500 series, which was being standardized as a directory system (a telephone-directory system).

These standards were developed jointly by three bodies: ISO (note 5), IEC (note 6), and ITU-T (note 8), a subordinate body of the ITU (note 7).

X.509 v3, established in 1996, defined the handling of the extension fields. Because X.509 was developed as an international standard, internationalization was considered from the outset. Put simply, the name of each stored entry is defined as follows.

    DirectoryString ::= CHOICE {
        teletexString    TeletexString    (SIZE (1..MAX)),
        printableString  PrintableString  (SIZE (1..MAX)),
        universalString  UniversalString  (SIZE (1..MAX)),
        utf8String       UTF8String       (SIZE (1..MAX)),
        bmpString        BMPString        (SIZE (1..MAX)) }

Table 1 lists what each of these string types can store.

Table 1 Types of internationalized strings that DirectoryString can represent

String type        Characters that can be stored
PrintableString    So-called alphanumerics
TeletexString      Internationalized strings, using ISO-2022
UniversalString    Internationalized strings, using UCS-4
BMPString          Internationalized strings, using UCS-2
UTF8String         ISO-10646 in the UTF-8 encoding

Note 5: International Organization for Standardization
Note 6: International Electrotechnical Commission
Note 7: International Telecommunication Union
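To make the difference between the three Unicode-based choices concrete, the following minimal sketch encodes one name under each of them and reports the byte length. The sample name and the helper `encoded_lengths` are illustrative assumptions, and Python's `utf-16-be` codec stands in for UCS-2, which is only equivalent for characters inside the Basic Multilingual Plane. Only the string contents are shown, not the ASN.1 tag and length octets.

```python
# Sketch: content octets of one name under the Unicode-based DirectoryString choices.
# Assumption: "utf-16-be" is used as a stand-in for UCS-2 (valid for BMP characters only).
ENCODINGS = {
    "UniversalString (UCS-4)": "utf-32-be",
    "BMPString (UCS-2)": "utf-16-be",
    "UTF8String (UTF-8)": "utf-8",
}


def encoded_lengths(text):
    """Return the number of content octets of text under each encoding."""
    return {label: len(text.encode(codec)) for label, codec in ENCODINGS.items()}


if __name__ == "__main__":
    sample = "日本 CA"  # hypothetical name: two kanji, a space, two ASCII letters
    for label, size in encoded_lengths(sample).items():
        print(f"{label}: {size} octets")
```

UCS-4 spends four octets on every character and UCS-2 two, while UTF-8 spends one octet on each ASCII character and three on each of the kanji in this sample.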

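3 The course of internationalization in the IETF

This section describes how the IETF has approached the internationalization of character sets in Internet protocols.

For the internationalization of new protocols, the IETF recommends adopting the Unicode-based ISO-10646 [2] as the character set and using it in the encoding known as UTF-8.

How the resulting strings are to be handled is defined by stringprep, RFC 3454 "Preparation of Internationalized Strings ("stringprep")" [3]. RFC 3454 considers how internationalized strings should be searched and compared, and standardizes the procedure. What a protocol actually uses is a profile of stringprep tailored to the needs of that protocol. Figure 2 shows these relationships conceptually.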

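[Figure 2: conceptual diagram relating the character set, stringprep, and the stringprep profiles nameprep and xxxprep]

Figure 2 Concept of internationalization in the IETF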

3.1 Standardization of internationalized character codes (RFC 2277/2278/3629)

The IETF's position on internationalized character codes is defined by three RFCs: RFC 2277, RFC 2278, and RFC 3629.

3.1.1 RFC 2277 "IETF Policy on Character Sets and Languages" BCP18

RFC 2277 (BCP18) [4] was published in January 1998. It draws on the recommendations of the "IAB Character Set Workshop" that the IAB held from February 29 to March 1, 1996 (published as RFC 2130).

RFC 2277 requires every Internet protocol to make clear, for all character data it uses, which character set is in use.

    All protocols MUST identify, for all character data, which charset is in use.

Quoted from RFC 2277, p. 1, 3.1. What charset to use

It also specifies that protocols must be able to handle UTF-8.

    Protocols MUST be able to use the UTF-8 charset, which consists of the ISO 10646 coded character set combined with the UTF-8 character encoding scheme, as defined in [10646] Annex R (published in Amendment 2), for all text.

Quoted from RFC 2277, p. 2, 3.1. What charset to use

For identifying the language of text, RFC 2277 proposes using the language tag defined in RFC 1766, the standard for specifying a language. It also notes that a POSIX locale is distinct from a language: a locale identifies a set of cultural conventions that may imply a language, whereas a language tag identifies only a language.

    4.3. How to identify a language

    The RFC 1766 language tag is at the moment the most flexible tool available for identifying a language; protocols SHOULD use this, or provide clear and solid justification for doing otherwise in the document.

    Note also that a language is distinct from a POSIX locale; a POSIX locale identifies a set of cultural conventions, which may imply a language (the POSIX or "C" locale of course do not), while a language tag as described in RFC 1766 identifies only a language.

Quoted from RFC 2277, p. 5, 4.3. How to identify a language

3.1.2 RFC 2278 "IANA Charset Registration Procedures" BCP19

RFC 2278 (BCP19) [5] was published in January 1998.

RFC 2278 defines the procedure for registering a new character set with IANA (Internet Assigned Numbers Authority). To that end, Section 2.3 makes the definition of a character set (charset) precise.

    2.3. Charset

    The term "charset" (see historical note below) is used here to refer to a method of converting a sequence of octets into a sequence of characters. This conversion may also optionally produce additional control information such as directionality indicators.

    Note that unconditional and unambiguous conversion in the other direction is not required, in that not all characters may be representable by a given charset and a charset may provide more than one sequence of octets to represent a particular sequence of characters.

    This definition is intended to allow charsets to be defined in a variety of different ways, from simple single-table mappings such as US-ASCII to complex table switching methods such as those that use ISO 2022's techniques, to be used as charsets. However, the definition associated with a charset name must fully specify the mapping to be performed. In particular, use of external profiling information to determine the exact mapping is not permitted.

    [...] to describe such straightforward schemes as US-ASCII and ISO-8859-1 which consist of a small set of characters and a simple one-to-one mapping from single octets to single characters. Multi-octet character encoding schemes and switching techniques make the situation much more complex. As such, the definition of this term was revised to emphasize both the conversion aspect of the process, and the term itself has been changed to "charset" to emphasize that it is not, after all, just a set of characters. A discussion of these issues as well as specification of standard terminology for use in the IETF appears in RFC 2130.

Quoted from RFC 2278, p. 2, 2.3 Charset

The procedure for actually registering a charset with IANA is defined in Section 4. As stated there, charset registration is not a formal standards process. It is positioned simply as an administrative procedure that gives the Internet community an opportunity to comment on and check a new charset.

    4. Registration Procedure

    The following procedure has been implemented by the IANA for review and approval of new charsets. This is not a formal standards process, but rather an administrative procedure intended to allow community comment and sanity checking without excessive time delay.

Quoted from RFC 2278, p. 6, 4. Registration Procedure

3.1.3 RFC 3629 "UTF-8, a transformation format of ISO 10646" STD63

RFC 3629 (STD63) [6] was published in November 2003.

RFC 3629 defines UTF-8 as the format for using ISO 10646 on the Internet.

    Abstract

    [...] Character Set (UCS) which encompasses most of the world's writing systems. The originally proposed encodings of the UCS, however, were not compatible with many current applications and protocols, and this has led to the development of UTF-8, the object of this memo. UTF-8 has the characteristic of preserving the full US-ASCII range, providing compatibility with file systems, parsers and other software that rely on US-ASCII values but are transparent to other values. This memo obsoletes and replaces RFC 2279.

Quoted from RFC 3629, p. 1, Abstract
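The property highlighted in the abstract, that UTF-8 preserves the US-ASCII range, can be checked with a short sketch. The function name `is_ascii_transparent` and the sample string are illustrative assumptions. The check confirms two things: each ASCII character encodes to the single octet with the same value, and no octet of a non-ASCII character falls inside the ASCII range.

```python
def is_ascii_transparent(text):
    """Check the ASCII-preserving property of UTF-8 for every character in text."""
    for ch in text:
        octets = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # An ASCII character must encode to exactly one octet of the same value.
            if octets != bytes([ord(ch)]):
                return False
        elif any(octet < 0x80 for octet in octets):
            # Octets of a non-ASCII character must never look like ASCII.
            return False
    return True


if __name__ == "__main__":
    sample = "cn=認証局,o=Example"  # hypothetical name mixing ASCII and kanji
    print(sample.encode("utf-8"))
    print(is_ascii_transparent(sample))
```

Because of this property, software that only looks for ASCII delimiters such as "=" and "," can parse the octets without being misled by the multi-octet characters.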

3.2 Standardization in the LDAPbis WG

For LDAP, it has been proposed that UTF-8 be used to represent the Subject in an X.500 directory. LDAP is used mostly for searching, so it is characterized by a strong orientation toward string matching.

The stringprep adopted in the LDAPbis WG defines normalization of the representation of strings, with particular emphasis on facilitating comparison and searching. It is also used in IDN (Internationalized Domain Name) [11].

3.2.1 Overview of StringPrep

This section describes RFC 3454 (commonly known as StringPrep, and referred to as StringPrep below) [3].

StringPrep was defined in the LDAPbis WG as the rules for comparing and normalizing strings when LDAP data is internationalized.

StringPrep began as an individual I-D by Paul Hoffman (VPN Consortium), written so that internationalized strings could be used in IDN. It was issued as draft-hoffman-stringprep-00.txt on September 27, 2001, reached draft-hoffman-stringprep-07.txt on October 4, 2002, and was finally published as RFC 3454, a PROPOSED STANDARD, in December 2002.

At present, building on RFC 3454, the LDAPbis WG is discussing three I-Ds on the handling of strings in the LDAP v3 protocol: draft-ietf-ldapbis-strprep-05.txt (hereafter I-D StringPrep) [15], draft-ietf-ldapbis-dn-16.txt (hereafter I-D StringPrep-DN) [16], and draft-ietf-ldapbis-filter-09.txt (hereafter I-D StringPrep-filter) [17].

Of these three I-Ds, I-D StringPrep is the direct successor to RFC 3454 and covers the handling of internationalized strings, I-D StringPrep-DN covers the handling of Distinguished Names in LDAP, and I-D StringPrep-filter covers the handling of filter strings in LDAP. Figure 3 shows how these I-Ds and RFCs relate.

[Figure 3: diagram. draft-hoffman-stringprep-00.txt (2001/9/27) through draft-hoffman-stringprep-07.txt (2002/10/4) were issued as Paul Hoffman's individual I-Ds as part of the IDN discussion and led to RFC 3454 (2002/12). draft-ietf-ldapbis-strprep-05.txt follows RFC 3454 and is issued as an I-D of the LDAPbis WG. draft-ietf-ldapbis-dn-16.txt and draft-ietf-ldapbis-filter-09.txt were written as independent I-Ds but are influenced by RFC 3454.]

Figure 3 I-Ds and RFCs related to StringPrep

The overall flow of StringPrep processing is:

1. Conversion to Unicode
2. Mapping of characters regarded as identical
3. Normalization
4. Check for prohibited characters
5. Check for bidirectional characters
6. Removal of semantically insignificant characters

The following sections walk through the StringPrep processing in order.

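3.2.1.1 Conversion to Unicode

In I-D StringPrep, conversion to Unicode forms a separate Section 2 and is properly defined, whereas RFC 3454 does not define it explicitly. The procedure below is the one in I-D StringPrep.

The conversion method depends on the type of string being converted:

1. Conversion of TeletexString to Unicode is defined in Section 2.1, Transcode, which treats the conversion as a local matter.
2. PrintableString is converted directly as it is.
3. Unicode-based strings are converted as they are, so UniversalString, UTF8String, and BMPString remain unchanged.
4. Some control characters are removed (the control characters to be removed are listed in Section 2.2).
5. Some other control characters are converted to spaces (these are also listed in Section 2.2).

The basic criterion appears to be that characters with a visual effect (line breaks, horizontal and vertical tabs, and so on) are converted to spaces and the rest are removed.

3.2.1.2 Mapping of identical characters

Identical characters are mapped according to tables. The result of mapping is not necessarily a single character and may expand into several characters. The tables used for mapping are defined in Appendices A-D of RFC 3454.

3.2.1.3 Commonly mapped to nothing

Table B.1 of RFC 3454 defines the characters, known as "Commonly mapped to nothing", that are removed from the input.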
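The removal of Table B.1 characters can be tried out with Python's standard `stringprep` module, which exposes the RFC 3454 tables as lookup functions. The helper name `remove_mapped_to_nothing` and the sample string are illustrative assumptions. This is a sketch of this single step, not of a complete profile.

```python
import stringprep


def remove_mapped_to_nothing(text):
    """Drop every character that RFC 3454 Table B.1 maps to nothing."""
    return "".join(ch for ch in text if not stringprep.in_table_b1(ch))


if __name__ == "__main__":
    # U+00AD (soft hyphen) and U+200B (zero width space) are both listed in Table B.1.
    sample = "exam\u00adple\u200b.jp"
    print(remove_mapped_to_nothing(sample))
```

Characters of this kind are usually invisible when displayed, which is why leaving them in place would make two visually identical strings compare as different.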

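3.2.1.4 Normalization

RFC 3454 and I-D StringPrep describe normalization in slightly different terms.

RFC 3454 uses normalization form KC as defined in UAX15 [12], and I-D StringPrep normalizes with NFKC as defined in UAX15 [12]. Form KC and NFKC are two names for the same normalization form, so the difference lies in the wording rather than in the algorithm.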
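The effect of NFKC on Japanese text can be seen with the standard `unicodedata` module. The helper name `nfkc` and the sample strings are illustrative assumptions. Note also that `unicodedata.normalize` follows the Unicode version shipped with the Python interpreter, whereas RFC 3454 is tied to Unicode 3.2, so results for characters added later may differ.

```python
import unicodedata


def nfkc(text):
    """Normalize text to Unicode normalization form KC (NFKC)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(nfkc("\uff21\uff22\uff23"))  # fullwidth Latin letters A, B, C
    print(nfkc("\uff76\uff9e"))        # halfwidth katakana KA + halfwidth voiced sound mark
    print(nfkc("\ufb01"))              # the single-character "fi" ligature
```

Fullwidth letters become ordinary ASCII letters, and the halfwidth katakana pair is first decomposed and then recomposed into the single character U+30AC, so differently typed forms of the same name end up as the same code point sequence.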

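3.2.1.5 Check for prohibited characters

As Table 2 shows, the prohibited characters differ slightly between RFC 3454 and I-D StringPrep. If a string to be compared contains a prohibited character, generation of the comparison string itself fails, and the result of the comparison is therefore "not equal".

RFC 3454 additionally places restrictions on strings that contain characters written from right to left (RFC 3454 and UAX9).

Finally, semantically insignificant characters are removed. What counts as "semantically insignificant" depends on the comparison method. RFC 3454 and I-D StringPrep assume three comparison methods (plain string comparison, numeric string comparison, and telephone number string comparison). The processing is:

1. Leading and trailing whitespace (space only) is removed.
2. This processing is applied after mapping.
3. Consecutive spaces in the middle of the string are collapsed into one.
4. Characters that are meaningless for the comparison method in use are removed.

Table 2 Differences in prohibited characters between RFC 3454 and the I-D

Specification   Prohibited characters
I-D             Unassigned code points (RFC 3454 Table A.1)
                Characters that change display properties (RFC 3454 Table C.8)
                Private use code points (RFC 3454 Table C.3)
                Non-character code points (RFC 3454 Table C.4)
                Surrogate code points (RFC 3454 Table C.5)
                Replacement character U+FFFD
RFC 3454        Space characters
                Control characters
                Private use code points
                Surrogate code points
                Characters inappropriate for plain text
                Characters inappropriate for canonical representation
                Characters that change display properties
                Tagging characters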
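The space handling summarized in this section (remove leading and trailing spaces, collapse interior runs of spaces) can be sketched as follows. This follows the simplified description given in this report, not the exact algorithm text of the I-D, and the function name `compress_spaces` is an illustrative assumption. Only SPACE (U+0020) is treated as whitespace, matching the "space only" remark.

```python
import re


def compress_spaces(text):
    """Strip leading/trailing SPACE (U+0020) and collapse interior runs to one SPACE.

    Assumption: the input has already gone through mapping and normalization.
    """
    return re.sub(" +", " ", text.strip(" "))


if __name__ == "__main__":
    print(repr(compress_spaces("  Example   CA  ")))
```

Other whitespace, such as tabs, is deliberately left alone here because this report describes such characters as being converted to spaces in the earlier conversion step.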

3.3 Standardization of IDN (Internationalized Domain Name)

As one of the IETF's internationalization standards, IDN (Internationalized Domain Name) has been standardized [11], and it is already usable in the form of Japanese domain names.

IDN applies matching rules, mapping at generation time, and normalization, which is almost the same processing as in PKIX/LDAP. This processing follows the profile defined in RFC 3491, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)" [13].

3.3.1 Nameprep

This section describes Nameprep [13], which is used in IDN. Nameprep is a profile of RFC 3454 specialized for representing the host names and domain names used in the DNS.

3.3.1.1 Mapping in Nameprep

Nameprep performs the mappings of RFC 3454 Tables B.1 and B.2.

    3. Mapping

    This profile specifies mapping using the following tables from [STRINGPREP]:

    Table B.1
    Table B.2

Quoted from RFC 3491, 3. Mapping
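The two mapping tables can be applied with Python's standard `stringprep` module. The sketch below covers only the mapping step of Nameprep, not normalization or the later checks. The function name `nameprep_map` and the sample label are illustrative assumptions.

```python
import stringprep


def nameprep_map(label):
    """Apply the RFC 3454 Table B.1 and Table B.2 mappings to each character."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):
            continue  # Table B.1: commonly mapped to nothing
        # Table B.2: case folding intended for use with NFKC; may yield several characters
        mapped.append(stringprep.map_table_b2(ch))
    return "".join(mapped)


if __name__ == "__main__":
    print(nameprep_map("Stra\u00dfe.Example"))
```

The sharp s in the sample expands into two characters, which illustrates that a mapping result is not always a single character.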


3.3.1.2 Normalization in Nameprep

Nameprep normalizes with NFKC, as described in StringPrep.

    4. Normalization

    This profile specifies using Unicode normalization form KC, as described in [STRINGPREP].

Quoted from RFC 3491, p. 2, 4. Normalization

3.3.1.3 Handling of prohibited output characters in Nameprep

Nameprep treats the characters defined in StringPrep Tables C.1.2, C.2.2, C.3, C.4, C.5, C.6, C.7, C.8, and C.9 as prohibited output. It also carries a note that IDNA (RFC 3490: Internationalizing Domain Names in Applications) [11] defines further prohibited characters.

    5. Prohibited Output

    This profile specifies prohibiting using the following tables from [STRINGPREP]:

    Table C.1.2
    Table C.2.2
    Table C.3
    Table C.4
    Table C.5
    Table C.6
    Table C.7
    Table C.8
    Table C.9

    IMPORTANT NOTE: This profile MUST be used with the IDNA protocol. The IDNA protocol has additional prohibitions that are checked outside of this profile.

Quoted from RFC 3491, p. 3, 5. Prohibited Output

3.3.1.4 Handling of bidirectional strings in Nameprep

Nameprep specifies that bidirectional strings are checked by the method defined in Section 6 of StringPrep.

    6. Bidirectional characters

    This profile specifies checking bidirectional strings as described in [STRINGPREP] section 6.

Quoted from RFC 3491, p. 3, 6. Bidirectional characters
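The bidirectional check of RFC 3454 Section 6 can be sketched with the Table D.1 (RandALCat) and Table D.2 (LCat) lookups in Python's standard `stringprep` module. The function name `bidi_ok` is an illustrative assumption. The sketch covers the two string-level requirements only. The remaining requirement of Section 6, that Table C.8 characters are prohibited, is assumed to be handled by the prohibited-output step.

```python
import stringprep


def bidi_ok(text):
    """Return True if text satisfies the string-level rules of RFC 3454 Section 6.

    Assumption: text has already been mapped and normalized.
    """
    if not any(stringprep.in_table_d1(ch) for ch in text):
        return True  # no right-to-left characters, nothing to check
    if any(stringprep.in_table_d2(ch) for ch in text):
        return False  # RandALCat and LCat characters must not be mixed
    # A string containing RandALCat characters must start and end with one.
    return stringprep.in_table_d1(text[0]) and stringprep.in_table_d1(text[-1])


if __name__ == "__main__":
    print(bidi_ok("example"))          # left-to-right only
    print(bidi_ok("\u05d0\u05d1"))     # Hebrew letters only
    print(bidi_ok("\u05d0a"))          # Hebrew letter followed by a Latin letter
```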

3.3.1.5 Handling of Unassigned Code Points in Nameprep

When IDNA calls for a list of unassigned code points, Nameprep uses the table of unassigned code points given in StringPrep Table A.1.

    7. Unassigned Code Points in Internationalized Domain Names

    If the processing in [IDNA] specifies that a list of unassigned code points be used, the system uses table A.1 from [STRINGPREP] as its list of unassigned code points.

Quoted from RFC 3491, p. 3, 7. Unassigned Code Points in Internationalized Domain Names

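3.4 Standardization in the Security Area

In the IETF/SAAG (note 9), standardization of passwords and user names is progressing as part of internationalization (for the time being, the discussion is taking place in the SSH WG).

This section introduces recent discussion in the SSH WG and the SASL WG.

3.4.1 Standardization in the SSH WG

In the SSH WG (note 10), SSH user information (the user name) and authentication information (credentials, passwords, and so on) need to be sent in UTF-8, and standardization of this is being attempted.

The need arose because, when OpenSSH (note 11) was internationalized on Sun Microsystems Solaris 8 and Microsoft Windows XP, there were cases in which user names were encoded in UTF-8, and how to process them is becoming a problem.

Note 10: Secure Shell Working Group
Note 11: http://www.openssh.org/

3.4.2 Standardization in the SASL WG

In the SASL WG (note 12), as in the SSH WG, standardization on UTF-8 is under way for internationalizing the user information and authentication information used in user authentication.

The framework defined for this purpose is SASLprep (RFC 4013, "SASLprep: Stringprep Profile for User Names and Passwords") [14].

SASLprep is a profile of RFC 3454 [3] specialized for user information and authentication information, and it occupies the same position as nameprep [13] and similar profiles. It appears likely that user information and authentication information in the SSH WG will be normalized with SASLprep.

3.4.3 SASLprep

This section describes SASLprep [14], which was established in the SASL WG and is being considered for use in SSH. As noted in the previous section, SASLprep is a StringPrep profile specialized for user information and authentication information. It was published as RFC 4013 in February 2005.

3.4.3.1 Mapping in SASLprep

In SASLprep mapping, as shown below, non-ASCII space characters may be mapped to the ASCII space character. In addition, the "commonly mapped to nothing" characters specified in StringPrep B.1 may be mapped to nothing (that is, removed).

    2.1. Mapping

    This profile specifies:

    - non-ASCII space characters [StringPrep, C.1.2] that can be mapped to SPACE (U+0020), and

    - the "commonly mapped to nothing" characters [StringPrep, B.1] that can be mapped to nothing.

Quoted from RFC 4013, p. 2, 2.1. Mapping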

3.4.3.2 Normalization in SASLprep

As in StringPrep, normalization uses NFKC as defined in UAX15 [12].

    2.2. Normalization

    This profile specifies using Unicode normalization form KC, as described in Section 4 of [StringPrep].

Quoted from RFC 4013, p. 2, 2.2. Normalization

3.4.3.3 Handling of prohibited output characters in SASLprep

As prohibited output, the following characters are prohibited as input:

1. Non-ASCII space characters (StringPrep C.1.2)
2. ASCII control characters (StringPrep C.2.1)
3. Non-ASCII control characters (StringPrep C.2.2)
4. Private use characters (StringPrep C.3)
5. Non-character code points (StringPrep C.4)
6. Surrogate code points (StringPrep C.5)
7. Characters inappropriate for plain text (StringPrep C.6)
8. Characters inappropriate for canonical representation (StringPrep C.7)
9. Characters that change display properties or are deprecated (StringPrep C.8)
10. Tagging characters (StringPrep C.9)

    2.3. Prohibited Output

    This profile specifies the following characters as prohibited input:

    - Non-ASCII space characters [StringPrep, C.1.2]
    - ASCII control characters [StringPrep, C.2.1]
    - Non-ASCII control characters [StringPrep, C.2.2]
    - Private Use characters [StringPrep, C.3]
    - Non-character code points [StringPrep, C.4]
    - Surrogate code points [StringPrep, C.5]
    - Inappropriate for plain text characters [StringPrep, C.6]
    - Inappropriate for canonical representation characters [StringPrep, C.7]
    - Change display properties or deprecated characters [StringPrep, C.8]
    - Tagging characters [StringPrep, C.9]

Quoted from RFC 4013, p. 3, 2.3. Prohibited Output

3.4.3.4 Handling of bidirectional strings in SASLprep

Bidirectional strings are required to be checked by the method described in Section 6 of StringPrep.

    2.4. Bidirectional Characters

    This profile specifies checking bidirectional strings as described in [StringPrep, Section 6].

Quoted from RFC 4013, 2.4. Bidirectional Characters

3.4.3.5 Handling of Unassigned Code Points in SASLprep

For unassigned code points, SASLprep inherits those specified in StringPrep A.1 as its list of unassigned code points.

    2.5. Unassigned Code Points

    This profile specifies the [StringPrep, A.1] table as its list of unassigned code points.

Quoted from RFC 4013, p. 3, 2.5. Unassigned Code Points
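Putting the SASLprep steps together, a minimal sketch built on Python's standard `stringprep` module might look like the following. The function name `saslprep_sketch` and the error messages are illustrative assumptions. The handling of unassigned code points (Table A.1) is left out, and normalization uses the Unicode 3.2 database that Python exposes as `unicodedata.ucd_3_2_0`, since RFC 3454 is tied to that Unicode version.

```python
import stringprep
from unicodedata import ucd_3_2_0

# Table lookups for the prohibited output listed in RFC 4013 Section 2.3.
PROHIBITED_TABLES = (
    stringprep.in_table_c12, stringprep.in_table_c21, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4, stringprep.in_table_c5,
    stringprep.in_table_c6, stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def saslprep_sketch(text):
    """Prepare a user name or password roughly as RFC 4013 describes (no A.1 check)."""
    # 2.1 Mapping: non-ASCII spaces become SPACE, Table B.1 characters are dropped.
    mapped = []
    for ch in text:
        if stringprep.in_table_c12(ch):
            mapped.append(" ")
        elif not stringprep.in_table_b1(ch):
            mapped.append(ch)
    # 2.2 Normalization: form KC.
    prepared = ucd_3_2_0.normalize("NFKC", "".join(mapped))
    # 2.3 Prohibited output.
    for ch in prepared:
        if any(in_table(ch) for in_table in PROHIBITED_TABLES):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    # 2.4 Bidirectional characters (RFC 3454 Section 6).
    if any(stringprep.in_table_d1(ch) for ch in prepared):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("right-to-left and left-to-right characters are mixed")
        if not (stringprep.in_table_d1(prepared[0])
                and stringprep.in_table_d1(prepared[-1])):
            raise ValueError("string must start and end with a right-to-left character")
    return prepared


if __name__ == "__main__":
    print(saslprep_sketch("I\u00adX"))   # soft hyphen is mapped to nothing
    print(saslprep_sketch("\u2168"))     # ROMAN NUMERAL NINE is normalized by NFKC
```

A production implementation would also need to decide between the "query" and "stored string" treatment of unassigned code points, which this sketch does not attempt.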

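3.5 Standardization in the PKIX WG

The PKIX WG works on standards that promote the use of X.509 on the Internet. Within this work, the handling of multi-byte character sets in the information held in certificates and CRLs has become an issue from the standpoint of internationalization.

As described earlier, X.509 has been standardized jointly by ISO, IEC, and ITU-T. In the IETF, the PKIX WG, under the Security Area, has been standardizing certificate and CRL profiles for using PKI on the Internet. This work produced RFC 2459 [7], which was later revised as RFC 3280 [8]. As of February 2005, the I-D RFC 3280bis [9], the successor to RFC 3280, is being drafted. Figure 4 shows how these standards relate.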

[Figure 4: diagram. ITU (International Telecommunication Union): X.509 3rd Edition (1997), X.509 4th Edition (2000), X.509 5th Edition (??). IETF (Internet Engineering Task Force): RFC 2459 (1999), RFC 3280 (2002.5), RFC 3279 (2002.5), RFC 3280bis (Son of 3280).]

Figure 4 Relationship among the certificate standards

3.5.1 Internationalization of certificates in RFC 2459/3280

The purpose of RFC 2459 [7] and RFC 3280 [8] is to define certificate and CRL profiles for using PKI on the Internet. With regard to multi-byte character sets in particular, several points in the description of Issuer and Subject deserve attention.

First, for DirectoryName, TeletexString and UniversalString are allowed only for backward compatibility, and names are to be encoded as PrintableString, BMPString, or UTF8String.

    The TeletexString and UniversalString are included for backward compatibility, and SHOULD NOT be used for certificates for new subjects. However, these types MAY be used in certificates where the name was previously established. Certificate users SHOULD be prepared to receive certificates with these types.

Quoted from RFC 3280, 4.1.2.4 Issuer

Second, RFC 2459 and RFC 3280 state that in every certificate issued on or after January 1, 2004, Subject and Issuer must be encoded as UTF8String. This statement first appeared in draft-ietf-pkix-ipki-part1-09.txt, an Internet-Draft of this profile issued on July 28, 1998.

    The DirectoryString type is defined as a choice of PrintableString, TeletexString, BMPString, UTF8String, and UniversalString. The UTF8String encoding [RFC 2279] is the preferred encoding, and all certificates issued after December 31, 2003 MUST use the UTF8String encoding of DirectoryString (except as noted below).

Quoted from RFC 3280, p. 19, 4.1.2.4 Issuer

This statement should be understood as recommending UTF-8 for internationalized strings in certificates. In practice, however, the requirement that all certificates issued after December 31, 2003 be encoded in UTF-8 is not being observed.

    Until that date, conforming CAs MUST choose from the following options when creating a distinguished name, including their own:

    (a) if the character set is sufficient, the string MAY be represented as a PrintableString;

    (b) failing (a), if the BMPString character set is sufficient the string MAY be represented as a BMPString; and

    (c) failing (a) and (b), the string MUST be represented as a UTF8String. If (a) or (b) is satisfied, the CA MAY still choose to represent the string as a UTF8String.

Quoted from RFC 3280, p. 19, 4.1.2.4 Issuer

Until December 31, 2003, as stated in the passage above, the recommendation is to encode with the smallest character set that can represent the name (in PKI, the character set is to be chosen in the order PrintableString < BMPString < UTF8String).
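The selection order PrintableString < BMPString < UTF8String can be sketched as a small function. The function name `choose_string_type` and the sample names are illustrative assumptions. The PrintableString repertoire used here is the usual ASN.1 one (letters, digits, space, and the characters ' ( ) + , - . / : = ?). The sketch only picks a type name and does not produce a DER encoding.

```python
# Characters permitted in an ASN.1 PrintableString.
PRINTABLE_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789 '()+,-./:=?"
)


def choose_string_type(value):
    """Pick the smallest DirectoryString choice able to represent value."""
    if all(ch in PRINTABLE_CHARS for ch in value):
        return "PrintableString"   # option (a)
    if all(ord(ch) <= 0xFFFF for ch in value):
        return "BMPString"         # option (b): fits in the Basic Multilingual Plane
    return "UTF8String"            # option (c)


if __name__ == "__main__":
    for name in ("Example CA", "認証局", "\U00020000"):
        print(repr(name), choose_string_type(name))
```

Under the rule for certificates issued after December 31, 2003, the function would simply return UTF8String in every case, which is the simplification that rule aims at.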

    Exceptions to the December 31, 2003 UTF8 encoding requirements are as follows:

    (a) CAs MAY issue "name rollover" certificates to support an orderly migration to UTF8String encoding. Such certificates would include the CA's UTF8String encoded name as issuer and and the old name encoding as subject, or vice-versa.

    (b) As stated in section 4.1.2.6, the subject field MUST be populated with a non-empty distinguished name matching the contents of the issuer field in all certificates issued by the subject CA regardless of encoding.

Quoted from RFC 3280, p. 19, 4.1.2.4 Issuer

In addition, when Subject and Issuer are compared during certificate path construction, string comparison for the current PrintableString is case-insensitive, whereas the specification describes string comparison for UTF8String as case-sensitive. This can cause problems of compatibility with existing practice.

For that reason, Steve Hanna (now at Funk Software, formerly at Sun Microsystems) and Paul Hoffman (VPN Consortium) issued an I-D titled International Strings in Certificate. That I-D has since been merged into the I-D 3280bis (February 18, 2005) [9], the successor to RFC 3280.

When a CA certificate is changed, CRLs may be issued under both the old and the new certificate. During the transition, the same certification authority may be represented by a CA certificate whose DirectoryName is encoded as PrintableString and by another whose DirectoryName is encoded as UTF8String, so two CRLs issued by that authority may exist for a time. In that case, a client that does not know the comparison rules for PrintableString and UTF8String has to select the CRL that matches the old or new encoding correctly. Otherwise, certificate validation may fail to work properly.

For that reason, the PKIX WG is discussing an I-D called CRLAIA (Internet X.509 Public Key Infrastructure Authority Information Access CRL Extension, draft-ietf-pkix-crlaia-00.txt) [10]. This I-D proposes placing the AIA (Authority Information Access) extension of certificates in CRLs, so that information about the certification authority that issued a CRL can be obtained from the CRL itself.

3.5.2 Internationalization of certificates in I-D 3280bis

Internet-Draft 3280bis (draft-ietf-pkix-rfc3280bis-00.txt) [9] has a Section 7 devoted to internationalized strings, which gathers the points to note when placing internationalized strings in certificates.

For DirectoryString, it now states that string comparison is performed in the same way as for PrintableString, that is, case-insensitively as far as ASCII is concerned.

    Conforming implementations MUST use the LDAP StringPrep profile, as specified in [LDAPBIS STRPREP], as the basis for comparison of distinguished name attributes of type DirectoryString. Name comparisons MUST be performed as caseIgnoreMatch with white space compression.

Quoted from draft-ietf-pkix-rfc3280bis-00.txt, p. 88, 7.1 Internationalized Name in Distinguished Names
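The idea of a case-ignoring comparison with white space compression can be approximated as follows. This is a rough sketch and not the LDAP StringPrep profile itself: it applies NFKC, Python's `str.casefold`, and space compression, and it omits the mapping tables, the prohibited-character check, and the bidirectional check. The function names `prepare_for_match` and `names_match` are illustrative assumptions.

```python
import re
import unicodedata


def prepare_for_match(value):
    """Approximate preparation of one attribute value for case-ignoring comparison."""
    value = unicodedata.normalize("NFKC", value)  # normalization form KC
    value = value.casefold()                      # ignore case
    return re.sub(" +", " ", value.strip(" "))    # white space compression


def names_match(first, second):
    """Compare two attribute values after the approximate preparation."""
    return prepare_for_match(first) == prepare_for_match(second)


if __name__ == "__main__":
    print(names_match("Example  CA", " example ca"))
    print(names_match("Example CA", "Example CB"))
```

With a comparison of this kind, a name encoded as PrintableString and the same name encoded as UTF8String compare as equal regardless of letter case, which is the compatibility concern that 3280bis addresses.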

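4 Summary

The PKI framework was designed with internationalization in mind from the beginning. Even so, because it adopted several different methods of internationalization, the internationalization of the character sets used by PKI in the IETF did not advance smoothly when it came to internationalizing as an international standard.

In Japan, the INTAP Directory WG once defined a localization profile for Japanese that used ISO 2022, adopted the JIS kanji code as the code system, and was based on T61String within TeletexString. Several products were announced and shipped, but the profile did not come into wide use as an internationalization standard.

The current handling of character sets in PKIX follows the course of internationalization on the Internet and proceeds along the following lines:

1. Adopting ISO 10646 as the character code
2. Adopting UTF-8 as the encoding
3. Abstracting internationalized strings on the basis of StringPrep
4. Defining the parts specific to PKI, such as Subject and Issuer, as a profile in RFC 3280bis

These lines are the very methods used in IDN, SASL, and elsewhere, and adopting methods that already have a track record is an advantage. We consider it necessary to study these standardization documents as they appear, to identify the problems that arise when they are applied to Japanese, and to give appropriate feedback to the IETF.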

References

[1] ITU-T Recommendation X.509 (2000) | ISO/IEC 9594-8:2000, "Information Technology - Open Systems Interconnection: The Directory: Authentication Framework," 2001

[2] International Organization for Standardization, "Information Technology - Universal Multiple-octet coded Character Set (UCS)", ISO/IEC Standard 10646, comprised of ISO/IEC 10646-1:2000, "Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane", ISO/IEC 10646-2:2001, "Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 2: Supplementary Planes" and ISO/IEC 10646-1:2000/Amd 1:2002, "Mathematical symbols and other characters"

[3] P. Hoffman, M. Blanchet, "Preparation of Internationalized Strings", RFC 3454, 2002

[4] Alvestrand, H., "IETF Policy on Character Sets and Languages", BCP 18, RFC 2277, January 1998

[5] Freed, N. and J. Postel, "IANA Charset Registration Procedures", BCP 19, RFC 2278, January 1998

[6] F. Yergeau, "UTF-8, a transformation format of ISO 10646", RFC 3629, 2003

[7] Housley, R., Ford, W., Polk, W. and D. Solo, "Internet X.509 Public Key Infrastructure Certificate and CRL Profile", RFC 2459, January 1999

[8] Housley, R., Ford, W., Polk, W. and D. Solo, "Internet X.509 Public Key Infrastructure Certificate and CRL Profile", RFC 3280, April 2002

[9] D. Cooper, S. Santesson, S. Farrell, S. Boeyen, W. Ford, "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile", <draft-ietf-pkix-rfc3280bis-00.txt>, work in progress, 2005

[10] S. Santesson, R. Housley, "Internet X.509 Public Key Infrastructure Authority Information Access CRL Extension", <draft-ietf-pkix-crlaia-00.txt>, work in progress, January 2005

[11] P. Faltstrom, P. Hoffman, A. Costello, "Internationalizing Domain Names in Applications", RFC 3490

[12] Mark Davis and Martin Duerst, "Unicode Standard Annex #15: Unicode Normalization Forms", Version 3.2.0, <http://www.unicode.org/unicode/reports/tr15/tr15-22.html>, March 2002

[13] P. Hoffman, M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003

[14] K. Zeilenga, "SASLprep: Stringprep Profile for User Names and Passwords", RFC 4013, February 2005

[15] K. Zeilenga, "LDAP: Internationalized String Preparation", <draft-ietf-ldapbis-strprep-05.txt>, work in progress, February 2005

[16] K. Zeilenga, "LDAP: String Representation of Distinguished Names", <draft-ietf-ldapbis-dn-16.txt>, work in progress, February 2005

[17] M. Smith, T. Howes, "LDAP: String Representation of Search Filters", <draft-ietf-ldapbis-filter-09.txt>, work in progress, November 2004