LinuxSir.cn, the time-traveling Linuxsir!


Do compilers have any requirements on the encoding of C source files?

Posted on 2004-10-18 17:58:51
I saved all my files as UTF-8. Will that cause errors when they are read?

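Also, when Emacs opens certain .c or .h files whose first line is a "/**/" comment, stray characters such as "/035" (I think that was it) show up.
Posted on 2004-10-18 18:34:35
I don't think so. The C standard specifies the following basic characters: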
  the 26 uppercase letters of the Latin alphabet
      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  the 26 lowercase letters of the Latin alphabet
      a b c d e f g h i j k l m n o p q r s t u v w x y z
  the 10 decimal digits
      0 1 2 3 4 5 6 7 8 9
  and the following 29 graphic characters
      ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
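These basic characters may be encoded in any way, as long as each one is handled as a single byte.
Multibyte (extended) characters are more complicated to handle; see ISO/IEC 9899:1999(E), 5.2.1.2.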
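To make the list above concrete, here is a minimal sketch of a membership test for the basic characters. The helper name `is_basic_source_char` is made up for illustration. The standard also puts the space character and a few control characters in the basic set, so the sketch accepts those as well.

```c
#include <stdbool.h>
#include <string.h>

/* The 91 graphic members of the basic character set quoted above:
   26 + 26 + 10 + 29. */
const char BASIC_GRAPHIC[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~";

/* Hypothetical helper: true if c is a basic character. */
bool is_basic_source_char(char c)
{
    if (c == '\0')
        return false;   /* strchr would otherwise match the terminator */
    /* Space, horizontal tab, vertical tab and form feed are basic
       characters too; new-line stands in for the end-of-line indicator. */
    if (c == ' ' || c == '\t' || c == '\v' || c == '\f' || c == '\n')
        return true;
    return strchr(BASIC_GRAPHIC, c) != NULL;
}
```

Note that characters such as `@`, `$` and the backquote are not in the list, and neither is any byte of a UTF-8 encoded Chinese character, so the function returns false for them.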
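Posted on 2004-10-19 09:47:22
I suspect that a file written directly in a Unicode internal encoding (one where even the basic characters take more than one byte) will not work.
External encodings such as UTF-8 and GBK should be fine.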
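The post does not say which internal encoding it means. Assuming it is something like UTF-16, the difference can be sketched by looking at the bytes of the keyword `int` in both forms. The array names and the helper `contains_zero_byte` are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* "int" as stored in a UTF-8 file: identical to ASCII,
   one byte per basic character. */
const unsigned char INT_UTF8[] = { 0x69, 0x6E, 0x74 };

/* "int" in UTF-16LE (an assumption about what "internal encoding" means):
   every basic character takes two bytes, and the second one is zero. */
const unsigned char INT_UTF16LE[] = { 0x69, 0x00, 0x6E, 0x00, 0x74, 0x00 };

/* Hypothetical helper: true if the buffer contains a zero byte. */
bool contains_zero_byte(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0x00)
            return true;
    return false;
}
```

The UTF-8 form passes the "each basic character is a single byte" test, while the UTF-16LE form contains zero bytes after every letter, which a compiler reading bytes would not see as the token `int`.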
Posted on 2004-10-19 12:27:03
Here is the corresponding text on multibyte characters from the C standard:
  5.2.1.2 Multibyte characters
  The source character set may contain multibyte characters, used to
  represent members of the extended character set. The execution
  character set may also contain multibyte characters, which need not
  have the same encoding as for the source character set. For both
  character sets, the following shall hold:
      - The basic character set shall be present and each character shall be
        encoded as a single byte.
      - The presence, meaning, and representation of any additional members
        is locale-specific.
      - A multibyte character set may have a state-dependent encoding,
        wherein each sequence of multibyte characters begins in an initial shift
        state and enters other locale-specific shift states when specific multibyte
        characters are encountered in the sequence. While in the initial shift
        state, all single-byte characters retain their usual interpretation and
        do not alter the shift state. The interpretation for subsequent bytes
        in the sequence is a function of the current shift state.
      - A byte with all bits zero shall be interpreted as a null character
        independent of shift state.
      - A byte with all bits zero shall not occur in the second or subsequent
        bytes of a multibyte character.
  For source files, the following shall hold:
      - An identifier, comment, string literal, character constant, or header
        name shall begin and end in the initial shift state.
      - An identifier, comment, string literal, character constant, or header
        name shall consist of a sequence of valid multibyte characters.
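For UTF-8 specifically, these rules are easy to check. UTF-8 has no shift states, the basic characters are single bytes below 0x80, and continuation bytes lie in 0x80..0xBF, so a zero byte can never appear inside a multibyte character. The following is only a simplified sketch of a "sequence of valid multibyte characters" check; the function names are made up, and overlong 3- and 4-byte forms and surrogate code points are not rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence introduced by lead byte b,
   or 0 if b cannot start a sequence. */
static size_t utf8_sequence_length(unsigned char b)
{
    if (b < 0x80)               return 1;   /* single byte, same as ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;                   /* stray continuation or invalid lead byte */
}

/* Hypothetical helper: true if buf[0..len) consists only of structurally
   valid UTF-8 characters (simplified, see the note above the code). */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        size_t n = utf8_sequence_length(buf[i]);
        if (n == 0 || n > len - i)
            return false;       /* bad lead byte or truncated sequence */
        /* Every continuation byte must look like 10xxxxxx, so it is
           never zero and never a basic character. */
        for (size_t k = 1; k < n; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;
        i += n;
    }
    return true;
}
```

A comment containing a UTF-8 encoded Chinese character passes this check, while the same character encoded in GBK does not, because its second byte is not a UTF-8 continuation byte.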
Posted on 2004-10-19 12:33:30
Recent releases of the distributions all use UTF-8 as their standard encoding, so of course it is supported.
Original poster | Posted on 2004-10-19 17:09:00
thx for all