/usr/share/doc/enca/README.devel is in enca 1.13-4.
This file is owned by root:root, with mode 0o644.
#============================================================================
# Enca v1.13 (2010-02-09) guess and convert encoding of text files
# Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
# Copyright (C) 2009-2010 Michal Cihar <michal@cihar.com>
#============================================================================
Contents
0. Developing programs utilizing libenca
1. How to add a new charset/encoding to libenca
2. How to add a new surface to libenca
3. How to add a new language to libenca
4. Automake, autoconf, libtool, ... note
0. Developing programs utilizing libenca
****************************************
* Look at libenca API documentation in devel-docs/html.
* Look at the enca source to see how it uses libenca.
Note that enca is quite a simple application (practically all libenca
interaction is in src/enca.c). It's single-threaded and uses one
language and one analyser all the time. Provided each thread has its own
analyser, libenca should be thread-safe (untested).
* Take names starting with ENCA, Enca, enca, _ENCA, _Enca, and _enca
as reserved.
* pkg-config is supported; you can use PKG_CHECK_MODULES to check for
libenca in your configure scripts.
1. How to add a new charset/encoding
************************************
(optional steps are marked `[optional]'):
iconvcap.c:
* Add a new test (even if you are 100% sure iconv will never support it);
see the top of iconvcap.c for some documentation on how it works.
tools/encodings.dat:
* Add a new entry.
* Use @ICONV_NAME_<name>@ (as it will appear in iconvcap output) for
iconv names.
tools/iconvenc.null:
* Add an entry for the new encoding (with NULL).
Specifically, for regular 8bit (language dependent) charsets:
lib/unicodemap.c:
* Add a new map to Unicode (UCS-2) unicode_map_...[].
* Add a new UNICODE_MAP[] entry.
lib/filters.c: [optional]
* Create a new filter or make an alias of an existing filter.
lib/lang_??.c:
* Add the new encoding to some existing language(s).
* Add appropriate filters or hooks [optional].
data/maps/??.map:
* Add a new map to Unicode (UCS-2)
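To show what a map to Unicode (UCS-2) for a regular 8bit charset amounts
to, here is a minimal sketch in C. All names (demo_map_entry,
DEMO_LATIN2_SAMPLE, demo_to_ucs2, demo_convert) are hypothetical and do not
reflect the actual layout of unicode_map_...[] or UNICODE_MAP[] in
lib/unicodemap.c. Only a handful of ISO-8859-2 letters are listed; a real
map would cover the whole charset.

```c
#include <assert.h>
#include <stddef.h>

/* Returned for codes the sample table does not cover (assumption). */
#define DEMO_NO_CHAR 0xffffu

/* Hypothetical sparse map entry: an 8bit code and its UCS-2 value. */
struct demo_map_entry {
    unsigned char code;   /* byte value in the 8bit charset */
    unsigned short ucs2;  /* corresponding UCS-2 code point */
};

/* A few ISO-8859-2 letters only; this is a sample, not a full map. */
static const struct demo_map_entry DEMO_LATIN2_SAMPLE[] = {
    { 0xa1, 0x0104 },  /* LATIN CAPITAL LETTER A WITH OGONEK */
    { 0xb1, 0x0105 },  /* LATIN SMALL LETTER A WITH OGONEK */
    { 0xc8, 0x010c },  /* LATIN CAPITAL LETTER C WITH CARON */
    { 0xe8, 0x010d },  /* LATIN SMALL LETTER C WITH CARON */
    { 0xd8, 0x0158 },  /* LATIN CAPITAL LETTER R WITH CARON */
    { 0xf8, 0x0159 },  /* LATIN SMALL LETTER R WITH CARON */
};

/* Map one byte to UCS-2; the ASCII half maps to itself. */
unsigned short demo_to_ucs2(unsigned char code)
{
    size_t i;
    size_t n = sizeof(DEMO_LATIN2_SAMPLE) / sizeof(DEMO_LATIN2_SAMPLE[0]);

    if (code < 0x80)
        return code;
    for (i = 0; i < n; i++) {
        if (DEMO_LATIN2_SAMPLE[i].code == code)
            return DEMO_LATIN2_SAMPLE[i].ucs2;
    }
    return DEMO_NO_CHAR;
}

/* Convert a buffer; dst must hold len entries.
 * Returns the number of bytes that had no mapping. */
size_t demo_convert(const unsigned char *src, size_t len, unsigned short *dst)
{
    size_t i, unmapped = 0;

    assert((src != NULL && dst != NULL) || len == 0);
    for (i = 0; i < len; i++) {
        dst[i] = demo_to_ucs2(src[i]);
        if (dst[i] == DEMO_NO_CHAR)
            unmapped++;
    }
    return unmapped;
}
```

With this sample table, demo_to_ucs2(0xe8) yields 0x010d and any 8bit code
missing from the table yields DEMO_NO_CHAR.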
Specifically, for multibyte encodings:
lib/multibyte.c:
* Create a new check function.
* Put it into appropriate ascii/8bit/binary test group
ENCA_MULTIBYTE_TESTS_ASCII[], ENCA_MULTIBYTE_TESTS_8BIT[],
ENCA_MULTIBYTE_TESTS_BINARY[].
* Put strict tests (i.e. tests which may fail) first and looks-like tests
last.
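The ordering rule can be sketched as follows. This is only an
illustration: the function names, the demo_test table and the
``double-byte'' heuristic are invented here, and the real check functions
in lib/multibyte.c work on the analyser state, which is not reproduced.
The strict test (well-formed UTF-8) comes first because valid UTF-8 text
would also satisfy the looser looks-like test.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_check_func)(const unsigned char *buf, size_t len);

/* Strict test: 1 only if the whole buffer is well-formed UTF-8 and
 * contains at least one multibyte sequence. */
static int demo_is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0, multibyte = 0;

    while (i < len) {
        unsigned char c = buf[i];
        unsigned long cp, min;
        size_t need, k;

        if (c < 0x80) { i++; continue; }
        if ((c & 0xe0) == 0xc0)      { need = 1; cp = c & 0x1f; min = 0x80; }
        else if ((c & 0xf0) == 0xe0) { need = 2; cp = c & 0x0f; min = 0x800; }
        else if ((c & 0xf8) == 0xf0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;                       /* stray continuation byte */
        if (len - i - 1 < need) return 0;    /* truncated sequence */
        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xc0) != 0x80) return 0;
            cp = (cp << 6) | (buf[i + k] & 0x3f);
        }
        /* Reject overlong forms, surrogates and out-of-range values. */
        if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
            return 0;
        i += need + 1;
        multibyte++;
    }
    return multibyte > 0;
}

/* Looks-like test (invented heuristic): every run of 8bit bytes has
 * even length, as it would in a purely double-byte encoding. */
static int demo_looks_like_doublebyte(const unsigned char *buf, size_t len)
{
    size_t i, run = 0, runs = 0;

    for (i = 0; i <= len; i++) {
        if (i < len && buf[i] >= 0x80) {
            run++;
        } else if (run > 0) {
            if (run % 2 != 0) return 0;
            runs++;
            run = 0;
        }
    }
    return runs > 0;
}

struct demo_test { const char *name; demo_check_func check; };

/* Strict tests first, looks-like tests last. */
static const struct demo_test DEMO_TESTS[] = {
    { "UTF-8", demo_is_valid_utf8 },
    { "double-byte (guess)", demo_looks_like_doublebyte },
};

/* Return the name of the first test that succeeds, or NULL. */
const char *demo_guess_multibyte(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < sizeof(DEMO_TESTS) / sizeof(DEMO_TESTS[0]); i++) {
        if (DEMO_TESTS[i].check(buf, len))
            return DEMO_TESTS[i].name;
    }
    return NULL;
}
```

If the two table entries were swapped, the bytes 0xc4 0x8d (UTF-8 for
U+010D) would be reported by the looks-like test before the strict test
ever ran.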
2. How to add a new surface
***************************
* Try to ask the author what to do, since this may be complicated, or
* Hack. Basically, it must be added to the EncaSurface enum in lib/enca.h
and to SURFACE_INFO[] in lib/encnames.c, and a detection method must be
added to lib/guess.c. Now the most complicated part: this new method must
be used ``in the right places'' in make_guess() in lib/guess.c.
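To give a feel for what a surface detection method involves, the sketch
below detects line-terminator surfaces and reports them as bit flags. The
DemoSurface enum and demo_detect_eol() are hypothetical; they are not the
EncaSurface values or the code in lib/guess.c, and the sketch says nothing
about where make_guess() would have to call such a method.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical surface flags, one bit each so they can be combined. */
typedef enum {
    DEMO_SURFACE_EOL_CR   = 1 << 0,  /* CR line terminators */
    DEMO_SURFACE_EOL_LF   = 1 << 1,  /* LF line terminators */
    DEMO_SURFACE_EOL_CRLF = 1 << 2,  /* CRLF line terminators */
    DEMO_SURFACE_EOL_MIX  = 1 << 3   /* more than one kind present */
} DemoSurface;

/* Count each kind of line terminator and return the matching flag,
 * or 0 when the buffer contains no line terminator at all. */
unsigned int demo_detect_eol(const unsigned char *buf, size_t len)
{
    size_t i, cr = 0, lf = 0, crlf = 0;
    int kinds;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                crlf++;
                i++;  /* skip the LF that belongs to this CRLF */
            } else {
                cr++;
            }
        } else if (buf[i] == '\n') {
            lf++;
        }
    }

    kinds = (cr > 0) + (lf > 0) + (crlf > 0);
    if (kinds == 0)
        return 0;
    if (kinds > 1)
        return DEMO_SURFACE_EOL_MIX;
    if (cr > 0)
        return DEMO_SURFACE_EOL_CR;
    if (lf > 0)
        return DEMO_SURFACE_EOL_LF;
    return DEMO_SURFACE_EOL_CRLF;
}
```

Using one bit per surface lets a caller combine the result with other,
unrelated surface flags using bitwise OR.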
3. How to add a new language
****************************
Create a new language file:
* Create new lib/lang_....c files by copying some existing ones (use the
locale code for names).
* Fill in all encoding and occurrence data, create filters and hooks (see
filters.c too). You can do it manually, but look at how it's done for
existing languages in data/* and read data/README.
lib/internal.h:
* Add new ENCA_LANGUAGE_....
src/lang.c:
* Add a new LANGUAGE_LIST[] entry pointing to the ENCA_LANGUAGE_....
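The registration step can be pictured with the following sketch. The
demo_language structure, its fields, the ``xx'' language and
demo_find_language() are all hypothetical; the real language descriptor
carries much more data (occurrence tables, filters, hooks) and its layout
is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced language descriptor. */
struct demo_language {
    const char *name;              /* locale code, e.g. "cs" */
    const char *const *charsets;   /* charsets considered for the language */
    size_t ncharsets;
};

static const char *const DEMO_CS_CHARSETS[] = { "ISO-8859-2", "CP1250" };
static const char *const DEMO_XX_CHARSETS[] = { "ISO-8859-2" };

/* An existing language. */
static const struct demo_language DEMO_LANGUAGE_CS = {
    "cs", DEMO_CS_CHARSETS,
    sizeof(DEMO_CS_CHARSETS) / sizeof(DEMO_CS_CHARSETS[0])
};

/* The newly added language (made-up locale code). */
static const struct demo_language DEMO_LANGUAGE_XX = {
    "xx", DEMO_XX_CHARSETS,
    sizeof(DEMO_XX_CHARSETS) / sizeof(DEMO_XX_CHARSETS[0])
};

/* The list only points at the descriptors; adding a language means
 * adding one pointer here. */
static const struct demo_language *const DEMO_LANGUAGE_LIST[] = {
    &DEMO_LANGUAGE_CS,
    &DEMO_LANGUAGE_XX,  /* new entry */
};

/* Find a language by locale code; returns NULL when it is unknown. */
const struct demo_language *demo_find_language(const char *name)
{
    size_t i;
    size_t n = sizeof(DEMO_LANGUAGE_LIST) / sizeof(DEMO_LANGUAGE_LIST[0]);

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(DEMO_LANGUAGE_LIST[i]->name, name) == 0)
            return DEMO_LANGUAGE_LIST[i];
    }
    return NULL;
}
```

A descriptor that is defined but never listed stays invisible to the
lookup, which is why the list entry is a separate step.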
4. Automake, autoconf, libtool, ... note
****************************************
If you run ./autogen.sh and it finishes OK, you are lucky and can expect
things to work.
You have to give --enable-maintainer-mode to ./configure (or ./autogen.sh) to
build dists and/or the strange stuff in tools/, data/, tests/, and
devel-docs/.