I give here the definitions of file format and text encoding as provided on the original 1987 disk. The content is the same (and is given in black), since it is still applicable, but it has been updated for browser display.

-E-G -----
ENCODING MANUAL: TEXTS, GRAPHEMIC FORMAT (March 16, 1987)
----------------------------------------------------------

Level 1 : intralinear

1a: column format.

Columns are indicated by Arabic numerals to conform to the published edition of the text. Should a column be omitted in the published edition, this column will receive an "a" (e.g. 3a refers to the column following 3 omitted in column numeration in the published edition of the text). Should a break in the text preclude the clear delimitation of columns, the designation ?+ will be utilized to reflect the unknown number of columns. When column sequence is to be inverted, column numeration will continue sequentially, and an ! will follow the column number.

The front and back of a tablet are designated as:
  r     front
  v     back

Writing on the edges of a tablet is designated as:
  le    left edge
  re    right edge

The condition of a column may be further specified as:
  cb1   the beginning of the column is broken
  cb2   the end of the column is broken
  cb3   the entire column is broken
  cb4   break within column of indeterminate length
  cb99  an unknown number of columns destroyed

The designations for broken columns are used only when the number of columns cannot be profitably approximated.

  ce1   the beginning of the column is blank
  ce2   the end of the column is blank
  ce3   the entire column is blank
  ce4   blank space of indeterminate length in the middle of the column

  cr1   beginning of column erased
  cr2   end of column erased
  cr3   entire column erased
  cr4   an erasure of indeterminate length in the middle of the column

1b: line format.

Each line must be provided with published line number. If none exists, one will be provided.
A 0 number may be utilized at the beginning of a column to provide further information about the text. A prime number (e.g. 1') will be used when a break in the text precludes sequential numbering from top to bottom.

1c: graphic unit, graphic condition, graphemic value

letters:
  a b d ...      for phonemic values
  A B D ...      for logograms
  *A *B *D ...   for unknown readings (i.e. signs which are identifiable, but whose reading in context is uncertain)
  ' ...          aleph
  c ...          ayin
  s` ...         sade
  s^ ...         shin
  t` ...         tet
  t^ ...         tha
  h ...          khet
  q ...          qof
  : ...          "Glossenkeil"

Numerals:
Arabic numerals are used exclusively, and are written so as to preserve the manner in which they appear graphically in the text. A hyphen separates the groups of units, tens and sixty-signs. For example, the number 94 would be transliterated as 60-30-4; 60 represents one sign, 30 represents three tens and 4 represents four units.

Optionally, in cases of textual ambiguity, an abbreviation may be added to render more explicitly the form of the sign (whether curviform or wedge), and the graphic orientation of the sign. Curviform signs are defined as those signs formed with the blunt cylindrical end of the stylus.
  (w)   wedge
  (c)   curviform
  (wh)  horizontal wedge-shaped
  (ch)  horizontal curviform-shaped
  (wv)  vertical wedge-shaped
  (cv)  vertical curviform-shaped
  (ws)  slanted wedge-shaped
  (cs)  slanted curviform-shaped

Fractions are written as 1/2, 1/4, 1/3.

condition of text:
  X        a single unreadable sign
  N        a single unreadable sign representing a numeral
  ...      undefined sequence of broken signs
  |        between two readings designates a partially broken sign whose clear identification is uncertain
  []       restored
  [^]^     broken sign or sequence of signs
  <>       added by modern editor
  <<>>     mistakenly written by scribe
  <<<>>>   modern correction of ancient error
  <>-<<>>  reflects the insertion of a sign, and the deletion of another. This is used in cases when a sign is written resembling the intended sign, and is so corrected by the modern editor.
  ?        after sign for uncertain reading
  !        after sign for abnormal graphic writing
  !!       after sign for divergence from published transliteration
  !!!      after sign designating both abnormal graphic writing and divergence from published transliteration

references to sign lists:
  u0(A)     the sign "A" is being read with a value of "u"
  10 (GUR)  number 10 in the series "GUR"

graphic markers:
  ln1      calligraphic rule
  ln2      string rule
  ln3      blank case
  ln4      blank space at the beginning of a case
  ln5      blank line at the end of a case
  vt       vertical line
  rs~A     erasure of an identifiable sign "A". When a sequence of identifiable signs is erased, the symbol rs~ will precede each.
  rs1,2,3  a specification of the approximate number of signs erased
  rs99     an erasure of an entire line
  dt1,2,3  indentation with designation of approximate number of signs indented

graphic relationships:
  (carriage return)  line boundary
  (blank)            word boundary
  -                  sign boundary
  .                  intralogographic boundary (e.g. PA.TE for EN5)
  @+                 ligature (e.g. a@+na)
  @x                 inclusion (e.g. KA@xME)
  @^                 in front of each sign designates a superscript word (e.g. @^a-@^na)
  @'                 in front of a superscript sign
  @.                 in front of each sign designates a subscript word (e.g. @.a-@.na)
  @>                 in front of graphically small signs
  @:                 in front of a graphically small word
  @<                 in front of a sign lacking its usual complement of strokes (e.g. @
  @;                 following a "gunu:" sign (e.g. GIR2@;)
  @|                 between two or more signs which appear vertically atop each other (e.g. AN@|AN)
  @#                 before a sign written upside down (e.g. @#UD)

The following designations are used exclusively to clarify graphic relationships of numerals:
  +   used in broken contexts to reflect that the numerals so connected are regarded as a unit (e.g. [5+]3)
  -:  indicates that the number which follows qualifies the preceding sign or sequence of signs (e.g. sa-ha-wa-:2)
  :-  indicates that the number which precedes qualifies the following sign or sequence of signs (e.g. 2 3:-*NI)
  +:  indicates that the numeral is written as a ligature (e.g. IB2+:2)
  x:  indicates that the numeral is written as an inclusion (e.g. IB2x:2)

1e: word level code

(i) graphemic level

(a) DETERMINATIVES
Preposed determinatives are immediately followed by = and the sign boundary - (e.g. DINGIR=-'a3-da; MI2=-is"-tar-um-mi; I=-mu-ka-an-ni-s"i-im). Postposed determinatives are immediately preceded by = which is preceded by the sign boundary designation - (e.g. ib-la-=KI; UD 2-=KAM; 'a3-da-um-=TUG2).

(b) PHONETIC COMPLEMENTS
Preposed phonetic complements are followed by the symbols =+ and postposed phonetic complements are preceded by the symbol += (e.g. LIM=+*LULIM+=LU; here LIM is a preposed complement while LU is a postposed complement).

(ii) non-graphemic level

  d_  any letter followed by an underscore preceding any single sign.

This code applies to either sign or word, hence character used will distinguish between two levels. These categorizations do not reflect written signs, but the interpretation of a sign or unit of signs, particularly for onomastics. These codes may be cumulative.
  p_  personal name, gender unknown
  f_  human feminine name
  F_  divine feminine name
  m_  human masculine name
  M_  divine masculine name
  D_  divine name, gender unknown
  g_  geographical names
  n_  other proper names

examples:
  g_ib-la-=KI
  m_ib-ri2-um
  M_DINGIR=-'a3-da
  D_DINGIR=-*NI-da-kul
  n_E2-DURU5-=KI
  ITI n_za-lul

1f: embedded note

  (space) !text of note! (space)

This convention is used regularly within this version of the Ebla corpus to direct attention to the broken sign lists compiled by the editors of the "Archivi Reali di Ebla Testi" (ARET) volumes. Should a broken sign or sequence of signs be represented in the tables of the ARET volumes, the embedded note designation will provide the page number upon which the sign(s) appear. For example, !ARET2 p.167! indicates that the broken sign on this particular line is to be found represented in ARET volume 2 page 167.
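
By way of illustration only, the following Python sketch shows how a program might read three of the conventions described in the manual: the hyphen-separated numeral groups of 1c, the determinative marker = of 1e (i)(a), and the name codes of 1e (ii). It is not part of the 1987 manual. The names `NAME_CODES`, `numeral_value` and `analyze_word` are hypothetical, and the sketch assumes a single encoded word that uses only plain sign boundaries; it does not attempt phonetic complements, the numeral qualifiers -: and :-, or the condition-of-text symbols.

```python
# Name codes from section 1e (ii); the descriptions follow the manual's list.
NAME_CODES = {
    "p": "personal name, gender unknown",
    "f": "human feminine name",
    "F": "divine feminine name",
    "m": "human masculine name",
    "M": "divine masculine name",
    "D": "divine name, gender unknown",
    "g": "geographical name",
    "n": "other proper name",
}


def numeral_value(token):
    """Add up the hyphen-separated groups of a transliterated numeral."""
    return sum(int(group) for group in token.split("-"))


def analyze_word(word):
    """Split one encoded word into name codes, determinatives and signs.

    Hypothetical helper: it recognizes only the letter-plus-underscore
    name codes, the sign boundary '-' and the determinative marker '='.
    """
    categories = []
    # The manual says name codes may be cumulative, so strip them in turn.
    while len(word) > 1 and word[1] == "_" and word[0] in NAME_CODES:
        categories.append(NAME_CODES[word[0]])
        word = word[2:]

    preposed, postposed, signs = [], [], []
    for sign in word.split("-"):
        if sign.endswith("="):
            # Preposed determinative, e.g. 'DINGIR=' in DINGIR=-'a3-da
            preposed.append(sign[:-1])
        elif sign.startswith("="):
            # Postposed determinative, e.g. '=KI' in ib-la-=KI
            postposed.append(sign[1:])
        else:
            signs.append(sign)

    return {
        "categories": categories,
        "preposed": preposed,
        "signs": signs,
        "postposed": postposed,
    }


if __name__ == "__main__":
    print(numeral_value("60-30-4"))
    print(analyze_word("g_ib-la-=KI"))
    print(analyze_word("M_DINGIR=-'a3-da"))
```

Applied to the manual's own examples, `numeral_value("60-30-4")` adds the groups 60, 30 and 4, while `analyze_word("g_ib-la-=KI")` separates the geographical name code, the signs ib and la, and the postposed determinative KI. A fuller reader would have to handle the remaining symbols before splitting on the sign boundary, since designations such as -: also contain a hyphen.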