Hale's Tarawan Vocabulary (1846)

(collected in 1841)

The Vocabulary

The first known description of Gilbertese was published in 1846, in the Ethnology and Philology volume of the report of the U.S. Exploring Expedition 1838-42, compiled by Horatio (Emmons) Hale.

Hale is relatively well known.1 He was born in 1817 in Newport, New Hampshire, and by the age of 17 had published a vocabulary of Algonquin. At Harvard he gained a reputation as an Amerindian linguist and ethnologist, resulting in his selection, at the age of 20, as the ethnographer and linguist for the US Exploring Expedition, which circumnavigated the globe between 1838 and 1842. Upon his return to the US, he spent the next four years preparing the text, published in Philadelphia in 1846 as Volume 6 of the expedition report. The scope of the work brought him international prominence, including compliments from Robert G. Latham and Max Müller, and it was considered one of the most valuable works of its kind at the time. Although he afterwards returned to his chosen profession of law,2 he continued with his Amerindian studies for the remaining 50 years of his life.3 He is remembered today chiefly for his contributions to the development of Boas,4 whom he counseled.5

In 1841, the exploring expedition was in the South Pacific, charting islands and collecting information about the native populations. Hale's vessel, the Peacock, after visiting islands in what is today known as Tuvalu, spent most of the month of April working northward through the Gilberts, beginning at Tabiteuea, where they stayed from April 3rd through the 9th. Hale's limited intercourse with the Tabiteueans provided the only native sources6 for his Gilbertese data, and his notes on leaving that section of the islands reveal his first impressions of the language:

The articulations are very difficult to catch, owing to the guttural and indistinct pronunciation; hence arose, at first, many singular mistakes in writing them, such as the use of t and even b instead of g; r for n and vice versa; d for r, &c. They had no f, h, j, s, or v. The l, which was once or twice written, was properly an r; p and b are the same sound; as are t and d, k and g. The  ñ is common, and in a few instances the softer nasal sound of the French language was heard. There are several combinations of consonants, as nt,  ñg, mr, nr, &c. Most of the words terminate in vowels, and where the contrary appears, the consonant is pronounced with a sort of prolonged utterance, as though there were a vowel suppressed; which we presume to have been really the case. Diphthongs are numerous, and being pronounced with great quickness, their orthography is sometimes doubtful, - as ui or oi, oo or ou, ae or ai.7

On two of the islands where they subsequently stopped, two beachcombers requesting transport to civilization were taken aboard, and for the month after leaving the Gilberts Hale had daily sessions with these two men.8 One, John Kirby, a Scotsman who had lived on Kuria (one of the three islands in the Abemama group) for three years,

was a man of considerable intelligence, had married the daughter of the principal chief of Kuria, and had the best opportunities for acquiring a knowledge of the language. His acquaintance with its idioms was more extensive than might be inferred from his defective pronunciation...9

The other, Robert Wood (alias Robert Grey), was an Irishman who had lived on Makin, at Butaritari, the northernmost island, for five years. He appears to have had less linguistic aptitude. As Hale reports:

Unfortunately we had no opportunity of communicating with the natives, after taking these men on board, and could not, therefore, make use of their intervention, to acquire a more thorough knowledge of the language than they themselves were able to furnish. Their chief deficiency was in pronunciation... On the other hand, in the words obtained by us directly from the natives, the language being entirely strange, with no common medium of communication, so mistakes, as a matter of course, were made in the meanings of words... The pronunciation, however, of these words was probably determined with considerable exactness...10

Hale's Tarawan Vocabulary (HTV) is an unnumbered, double-column, alphabetical list occupying 23 pages of the Philology section of the expedition report.11 There are 966 entries,12 each beginning with a capitalized and italicized Tarawan word,13 sometimes followed by one or more additional italicized words and by indications of their sources. In all, these 966 units of HTV, for which I use the term entry, contain 1204 purported Gilbertese forms.14 I use the term Tarawan-word (T-word) to refer to any of these 1204 forms, which appear at the beginning of each entry. For the first (usually the only) T-word of an entry, the form used for Hale's alphabetization, I use the term headword (h-word). For any of the 238 additional T-words in an entry, I use the term variant-word (v-word). Entries which contain v-words will be called multi-entries (m-entries), while those containing an h-word only will be designated single-entries (s-entries). To summarize:

HTV (Hale's Tarawan Vocabulary): the section of the Philology chapter of the expedition report titled "Tarawan Vocabulary".
entry: any of the 966 major divisions of HTV.
T-word (Tarawan-word): any of the 1204 forms appearing at the beginning of each entry in HTV, representing the Gilbertese vocabulary recorded by Hale; T-words = h-words + v-words.
h-word (headword): any of the 966 T-words by which the entries were alphabetized and numbered.
v-word (variant-word): any of the additional 238 T-words appearing at the beginning of entries.
s-entry (single-entry): any of the 792 entries containing no v-words.
m-entry (multi-entry): any of the 174 entries containing v-words.
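
The terminology above can be read as a small data model. The following is a minimal sketch only, not the structure of the actual data files: the class name Entry, the function summarize, and the arrangement of the sample forms into h-words and v-words are assumptions made for illustration. The forms themselves (bairi/baini, mariap/maniap, umane) are taken from Hale's notes quoted in this section.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entry:
    """One HTV entry: a headword plus zero or more variant words."""
    h_word: str
    v_words: list[str] = field(default_factory=list)

    @property
    def is_multi(self) -> bool:
        # An m-entry is any entry that carries at least one v-word.
        return len(self.v_words) > 0

    @property
    def t_words(self) -> list[str]:
        # T-words = h-word + v-words, in the order they appear.
        return [self.h_word, *self.v_words]


def summarize(entries: list[Entry]) -> dict[str, int]:
    """Count entries and words using the definitions given in the text."""
    n_entries = len(entries)
    n_v = sum(len(e.v_words) for e in entries)
    n_m = sum(1 for e in entries if e.is_multi)
    return {
        "entries": n_entries,
        "h_words": n_entries,          # exactly one h-word per entry
        "v_words": n_v,
        "t_words": n_entries + n_v,
        "s_entries": n_entries - n_m,
        "m_entries": n_m,
    }


# Hypothetical arrangement of forms quoted from Hale (illustration only).
SAMPLE = [
    Entry("bairi", ["baini"]),
    Entry("mariap", ["maniap"]),
    Entry("umane"),
]

SUMMARY = summarize(SAMPLE)
```

Applied to the full vocabulary, the same identities give 966 h-words + 238 v-words = 1204 T-words, and 792 s-entries + 174 m-entries = 966 entries.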

In general, most entries are s-entries, with an indication of source, one or more glosses, one or more example sentences or phrases, and occasionally, cross-references to other entries. Hale explains in his introduction to the Vocabulary:

The words marked with an asterisk are those which were obtained directly from the natives, and of which the orthography is therefore more likely to be correct than that of the rest of the vocabulary, furnished by the two interpreters. The words are, in general, given as they were written down, though, in some cases, where the alteration produced by the incorrect pronunciation of the interpreters was evident, the word has been restored to what was plainly the correct form. When the two men differed from one another, or when, from the indistinctness of the pronunciation, a difficulty was experienced in determining the exact pronunciation of a word, the variations have been given in brackets, - sometimes with the initial of the interpreter attached to it, when there is a possibility that the discrepancy may indicate a difference of dialect.15

However, of the 966 entries, three16 appear without glosses, but with an example phrase, and another four17 contain only cross references of the type "see..." or "same as...", with no other material. An additional seven h-words include the designation "(?)"18 and one more has an explanatory respelling of a Grey (source) h-word.19


Of the 966 entries, 792 (82%) are single-entries: an h-word with no v-words, the normal dictionary-style arrangement. Of these, 492 (62%) are unmarked as to source, and I have designated those in the tables with "H", to indicate the forms for which Hale did not see the need to indicate a source. A further 195 (25%) of the s-entries are marked "*" (designated "N" in the tables) for native informants, 93 (12%) for Grey ("G"), and 9 (1%) for Kirby ("K"). None of the s-entries is marked for "Interpreter" ("I"), a designation Hale used only when there was some difference (of which he was unclear) between a v-word and an h-word. In addition, there are four T-words20 in the data marked as "at Makin", "Makin", "(M)", and "(M.)", unexplained by Hale. Two of these are designated as native, and so it seems probable that all four were elicited from natives during the stop at Makin, where Grey was picked up. In tabulations, I have included these forms with those of the Native, or marked them with "M". Three of them are s-entries, while the fourth21 is listed as a v-word along with a Hale h-word.


The remaining 174 entries contain among them 238 v-words. The majority of these entries (118) have Hale h-words, and 52 have Native h-words. Only four have h-words attributed to Grey or Kirby. Although one Hale h-word has five interpreter v-words, the majority show one or two v-words.

Source             s-entries  m-entries  entries  v-words  T-words
Hale (H)                 492        118      610       48      658
Native (N)               195         52      247       25      272
Interpreter (I)            0          0        0      112      112
Grey (G)                  93          3       96       36      132
Kirby (K)                  9          1       10       16       26
Makin (M)                  3          0        3        1        4
Total                    792        174      966      238     1204
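
The arithmetic behind the table can be checked mechanically. The sketch below is illustrative only: the names COUNTS, derive_rows, column_totals and percent are assumptions, and the figures are copied from the table. It recomputes the two derived columns (entries = s-entries + m-entries; T-words = entries + v-words), the column totals, and each source's share of the s-entries.

```python
# Per-source counts copied from the table: (s-entries, m-entries, v-words).
COUNTS = {
    "Hale (H)": (492, 118, 48),
    "Native (N)": (195, 52, 25),
    "Interpreter (I)": (0, 0, 112),
    "Grey (G)": (93, 3, 36),
    "Kirby (K)": (9, 1, 16),
    "Makin (M)": (3, 0, 1),
}


def derive_rows(counts):
    """Add the derived columns: entries = s + m, T-words = entries + v."""
    rows = {}
    for source, (s, m, v) in counts.items():
        entries = s + m
        rows[source] = {"s": s, "m": m, "entries": entries, "v": v, "t": entries + v}
    return rows


def column_totals(rows):
    """Sum every column over all sources."""
    keys = ("s", "m", "entries", "v", "t")
    return {k: sum(row[k] for row in rows.values()) for k in keys}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)


ROWS = derive_rows(COUNTS)
TOTALS = column_totals(ROWS)

if __name__ == "__main__":
    print(TOTALS)
    for source, row in ROWS.items():
        # Shares are computed on the s-entry base (792), not on all 966 entries.
        print(f"{source:16} {percent(row['s'], TOTALS['s']):3d}% of s-entries")
```

Keeping the base explicit matters here: a count such as 195 is about 25% of the 792 s-entries but about 20% of all 966 entries.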


In his description, Hale incorporates the following graphemes:

p22     t23        k                a   e24  i   o   u
b       d25        g26              á   é    í   ó   ú    stress
m       n27        ñ28     N        â   ê    î   ô   û    macron
w, v    l,29 r30   h       ç        ä   ë    ï   ö   ü    breve
        s, z               Z        à   è    ì   ò

As he explains in his description of the general orthography used for all the languages in the report,31 [ç] represents /sh/ and [j] represents /zh/. Thus the digraphs [tç] (ç) and [dj] (Z) represent the affricates /č/ and /ǰ/. The [ny] (N), which appears only once, is taken to represent the palatal nasal /ñ/.

As for the vowels, the macron and breve indicate vowel quality rather than length. A dozen or so umlauted vowels appear as well, and two cases of a broad /a/ (å), for /ɔ/. In the list and the accompanying orthographic explanation, Hale makes it clear that he expects the consonant system of Gilbertese to be

p   t   k
m   n   ñ
w   r

and that the other sounds are artifacts of the interpreters' mispronunciations.32
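
Hale's remarks suggest a simple way of screening recorded forms against the consonant system he expected. The sketch below is an assumption-laden illustration, not a procedure Hale or this study applies: the names FOLD, EXPECTED, fold_stops and unexpected_consonants are hypothetical. It folds b, d, g into p, t, k (the pairs Hale calls "the same sound") and then lists any remaining consonant letters outside his eight. It handles single letters only, so digraphs such as tç or ny would need separate treatment, and it does not try to guess which sound an interpreter's l, s or v stood for.

```python
# Voiced/voiceless pairs that Hale treats as one sound: b/p, d/t, g/k.
FOLD = str.maketrans({"b": "p", "d": "t", "g": "k"})

# The eight consonants Hale expected, in his orthography (ñ stands for ng).
EXPECTED = set("kmnñprtw")

# Plain and accented vowel letters that occur in the grapheme table.
VOWELS = set("aeiouáéíóúâêîôûäëïöüàèìòå")


def fold_stops(form: str) -> str:
    """Lower-case a form and merge b, d, g into p, t, k."""
    return form.lower().translate(FOLD)


def unexpected_consonants(form: str) -> list[str]:
    """Return the consonant letters that fall outside Hale's expected eight."""
    folded = fold_stops(form)
    return sorted(
        {ch for ch in folded if ch.isalpha() and ch not in EXPECTED and ch not in VOWELS}
    )


if __name__ == "__main__":
    for form in ("bairi", "mariap", "lango"):
        print(form, fold_stops(form), unexpected_consonants(form))
```

A form such as lango (cited by Hale as the Polynesian counterpart of Tarawan nango) is flagged for its l, which Hale says does not properly belong to the language, while bairi passes once b is folded into p.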

Gilbertese Phonemic Inventory

The phonemic inventory of modern Gilbertese is virtually what Hale expected, except for the presence of velarized /bw/ and /mw/:
bw  b   t   k        i       u
mw  m   n   ñ          e   o
    w   r                a

Length is also a phonemic contrast, with each vowel and nasal showing a corresponding long form.

The Gilbertese labial stop is voiced and unaspirated, making it difficult to distinguish from a /p/, as Hale noted. In addition there is a velarized labial stop, non-distinctive before /o/ and /u/, but sounding like /bw/ before /e/ and /i/. Before /a/ it changes the vowel sound, resulting in what seems like two /a/ vowels, one closer to /æ/ and one to /a/.

Distribution of graphemes by source

The actual distribution of all the graphemes in the corpus is as follows:


















1 Encyclopedia Britannica, 11th Edition: his mother was known as the probable author of "Mary had a little lamb...", and as the recommender of Thanksgiving, in addition to being a poet.
2 passing the Illinois bar in 1845.
3 Brinton, Daniel G., obituary in The American Anthropologist, Vol X, Jan 1897, pp25-7.
4 Encyclopedia Britannica, 15th Edition.
5 Major works listed in Fred C. Sawyer's introduction to the reprint edition of Vol 6, Gregg Press 1968.
6 Except for the four Makin words mentioned below.
7 From the Orthography article in the Philology Chapter.
8 Chapter III, Manners and customs of the Kingsmill Islanders, p 79 "...the opportunity Mr. Hale enjoyed was the very best for eliciting information, as (the interpreters) were upwards of a month on board the ship, and were under examination day after day."
9 Hale's introduction to the grammar.
10 Hale's introduction to the grammar.
11 pp446-468.
12 I have numbered these sequentially to simplify reference. A photocopy of the original Hale list with the numbers penned in, as well as a numbered printout of the list as it appears in my data is in the Appendix.
13 sometimes preceded by the "*" designating native source.
14 not including example sentences.
15 Hale's note to the vocabulary.
16 Entries 324, 580, 799.
17 Entries 20, 379, 659, 829.
18 Entries 19, 123, 138, 252, 344, 591, 875.
19 Entry 41 "(for te anti)."
20 Entries 436, 556, 724, 833.
21 Entry 724.
22 "No distinction is made between b and p, and the following words might have been inserted under the former letter. We have preferred, however, to retain them as they were originally written."
23 "It has been sometimes impossible to decide whether this letter, when initial, properly belongs to the word, or is the prefixed article te."
24 "By both the interpreters, the long e: was frequently substituted for the diphthong ai, and the short ë for a."
25 "The letter d is of comparatively rare occurrence, and seems, in all cases, to be a softened sound of the T."
26 "This letter is merely a softer pronunciation of the k, which probably might in all cases be substituted for it with propriety."
27 "The interpreters sometimes interchanged this letter with r, as, bairi and baini, nose, mariap and maniap, council-house. In some instances this was probably agreeable to the practice of the natives, as we find the l and r of many Polynesian and Malay words changed in Tarawan to n; as, nango, fly, for lango, - nako, to come, for lako, &c. We heard, also, at Taputeouea, kane:pu and kare:p', umane and umare, &c."
28 "Both Kirby and Grey had much difficulty in pronouncing this element; the first frequently substituted for it an n, and the latter a k."
29 "This letter does not properly belong to the language. In a very few words it was used by the interpreters, probably for some other letter, - r, n, or t."
30 "This letter was occasionally confounded with n and d."
31 "Alphabet" explanation.
32 "In the pronunciation of Grey and Kirby the sounds x, j, c, s and v were heard, but they seemed, in all cases, to be corruptions of other elements, as, k, t, and w. If the language should, hereafter, be studied on the spot by a competent person, the number of consonant sounds will probably be reduced to eight, viz: - k, m, n, ng, p, r, t, w, being (with the exception of the ng) the same elements as are found in the Hawaiian."