Revival of the Blog

As will have been apparent, I’m not really a blogger: lexicographic work doesn’t have frequent exciting leaps forward, hence Sam Johnson’s ‘Harmleſs Drudge’ quip. But I am now trying to make a distinction between this blog and the technical website where I make available the dictionaries and some of my related linguistic studies: Aardvarks Lexico. A schematic diagram summarising this long, rambling post can be seen here.

I have reached a pause-point in the editing of the Mampruli Dictionary and this seems a stage where I might put the whole enterprise in context.

When I went to Ghana, Joe Grimes was leading a workshop on Discourse, based on his book ‘The Thread of Discourse’.

While in Ghana Grimes was also collecting word-lists of vocabulary in the Ghana languages in connection with an ahead-of-its-time project for a comparative database of the languages of the world. The plan was to use the data from Summer Institute of Linguistics researchers round the world and the computer facility of the University of Oklahoma (remember that at that time ‘a computer’ filled a 4-storey building and had about the computing power of the non-smart phone in your pocket). Joe was working on algorithms to make it possible to compare vocabulary items by phonetic template, grammatical function and/or meaning, pairwise or in subsets of languages.
After he left, Grimes wrote (no phones or e-mails in those days, folks!) to ask me to collect more and better wordlists of the Ghana languages, as the samples available off-the-peg in 1973 were very tentative preliminary survey data.
Having started collecting, I then heard that the project fell through for lack of funding, but I had already invested some time and got hooked on the interest of comparative lexicography.

Meanwhile we had received our primary assignment, to study and help in the development of the Mampruli language (1974), and settled in the village of Gbeduuri in the Northern Region of Ghana.

The state of knowledge before this process started can be seen in Swadesh, Mauricio / Evangelina Arana / John T. Bendor-Samuel / W.A.A. Wilson 1966. A preliminary glottochronology of Gur languages. J.W.A.L. III (1): 27-65 (see: facsimile page).
In 1975 we received a book by Gabriel Manessy, the French scholar who was the main expert on the Gur group of languages to which most of those we were working on in northern Ghana belong. He also had wordlists (though not much new data since Swadesh): Manessy, Gabriel 1975. Les langues Oti-Volta. Paris: SELAF (see: facsimile page).

Manessy also suggested historical developments and relationships of the languages and reconstructed ancestral forms as etymological formulae (see: facsimile page).

On the basis of these earlier studies, I focused on the Western Oti-Volta subgroup of these languages, which includes Mampruli and many of the neighbouring tongues. My main field dictionary of Mampruli was built up as I and my colleagues (primarily Margaret Langdon in 1974-6 and Tony Pope in 1977-82) worked on the language, on 6 x 4″ light cards, written in pencil and filed in the then-iconic shoeboxes (see facsimile card). In odd moments of time I was also collecting entries into a series of mini slip-files, on paper slips 3 x 2″. (Arthur Hokett, a neighbouring Texan missionary of the Assemblies of God Church, had off-cut strips from reducing the USA-sized paper he had brought from home to the imperial sizes then standard in Ghana.) You can see instances of these little slips: facsimile slips.

The Western Oti-Volta languages (W.O/V) that I aimed to cover were (grouped approximately by closeness of relationship, not by alphabetic or geographical criteria):

KaMara, Hanga, Mampruli, Dagbani, Nanun, Talni, Nabit, Kusaal (Toende), Kusaal (Agole), Mõõré, Nõõtré, Farefare, Ninkããré, Waali, Dagaari, Birifor, Safaliba. I also collected data in the closely related Oti-Volta language Buli and, when it was ‘discovered’, Kɔnni. At an even later stage Kantoosi was added to the list of W.O/V (at the beginning of the above list).

The concept which I developed as a goal was a comparative dictionary of these languages, primarily keyed to a concept and arranged semantically – thus an entry showing the word for ‘sun’ in all the languages would be grouped with an entry for ‘moon’ and one for ‘star’, and so on. Reference to individual concepts would be facilitated by an alphabetic index. I have made a sample entry to show the idea. For the semantic keys I developed a Thesaurus based on work by Philip Hewer and originally published as a hard-copy booklet of keys referenced by English words and arranged in a semantic framework with letter and number indices. This looked like facsimile Thesaurus page. The items and thesaurus structure were partly based on our understanding of the vocabulary of the north-Ghanaian languages we were working on, and the hope was that with colleagues completing thesauri in the various languages the ‘emic’ nature of the schema could be refined (unfortunately this did not materialise through lack of take-up). As a comparative dictionary this approach would detect related (‘cognate’) words where the relationship was obscured in alphabetical wordlists because of slight semantic change in one language or another, or even just different glossing-choices by the investigators.
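For the technically minded, the layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the way my databases are actually built: the thesaurus keys ("A1.1", "A1.2") and the word forms are placeholders I have made up for the sketch, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical wordlist rows: (language, thesaurus key, English gloss, form).
# Keys and forms are placeholders for illustration, not real data.
ROWS = [
    ("Mampruli", "A1.1", "sun", "<form-1>"),
    ("Dagbani", "A1.1", "sun", "<form-2>"),
    ("Mampruli", "A1.2", "moon", "<form-3>"),
    ("Dagbani", "A1.2", "moon", "<form-4>"),
    ("Kusaal (Agole)", "A1.1", "sun", "<form-5>"),
]


def build_comparative(rows):
    """Group forms by thesaurus key, then by language."""
    entries = defaultdict(lambda: {"gloss": None, "forms": {}})
    for language, key, gloss, form in rows:
        entry = entries[key]
        entry["gloss"] = entry["gloss"] or gloss
        entry["forms"].setdefault(language, []).append(form)
    return dict(entries)


def alphabetic_index(entries):
    """Return (gloss, thesaurus key) pairs sorted alphabetically by gloss."""
    return sorted((entry["gloss"], key) for key, entry in entries.items())


comparative = build_comparative(ROWS)
index = alphabetic_index(comparative)

# Entries sorted by key sit next to their semantic neighbours
# ('sun' beside 'moon'), while the index gives alphabetic access.
for key in sorted(comparative):
    print(key, comparative[key]["gloss"], comparative[key]["forms"])
```

Because every language's entry carries the same key, words that have drifted apart in their English glosses still land in the same group, which is the point of keying by concept rather than by alphabet.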
In order to publicise the project and try to get data and collaboration from colleagues, I started producing ‘Lexinotes’ which exemplified, with commentary, entries for a group of concepts (such as words for ‘water-features’ [river, lake, well, swamp …]). I kept swinging to and fro between taking one word and finding it in wordlists of all the languages one by one, and compiling the comparative dictionary from the wordlists and extracting the individual entries ready-formed from that. Because progress on the different languages was patchy and time limited, I tended to use the first alternative for the Lexinotes and similar studies, but later on changed focus to the second option. The Lexinotes were duplicated piecemeal in a limited edition and mostly do not survive. Some digitalised examples can be seen on the website.
At the end of the 1970s Manessy published a comparative study of all the Central Gur languages [Manessy, Gabriel 1979. Contribution à la classification généalogique des langues voltaïques : – le proto-central. Paris : SELAF]. Over the next 30 years, in dribs and drabs, I made a manuscript compilation of all Manessy’s proto-forms, then a digital database, added my own less-firmly-based summary forms where Manessy’s data were inadequate, and finally a database keyed to the thesaurus headings from which relevant ‘etymologies’ could be added to the entries in the dictionaries of the individual languages and the comparative dictionary. An example can be seen in the comparative sample entry (in green).

A major change in all of our lives came in the 1990s with the introduction of computers and digital processing. On the one hand this offered very powerful help to lexicographical work, in terms of making copies and backups of materials, safe from fire, flood and termites, and in facilitating alphabetisation, searching, cross-linking and other valuable tools. On the other hand for projects like mine which were already somehow advanced in manuscript form there was the prerequisite of keyboarding large swathes of data, with no advances possible until it was done.

In my case I am still transcribing handwritten slips and cards from the best part of forty years ago.

Also the rapid advances of computing meant that existing work had constantly to be updated and converted for new computer systems, applications, storage media and protocols.

In order to maintain continuity and compatibility with material from earlier phases of the project, I have to maintain formats which were designed to cope with the inability of earlier hardware (processing, caching and file-storage limitations) and software (earlier applications or versions of applications) to handle multi-word items or items containing (certain sorts of) punctuation in sorting and linking tasks.
The Unicode standard is an immense leap forward for all work in world languages, but again I am still finding myself referring to files which need to be converted from legacy ASCII work-arounds for the representation of the orthographies and phonetics of Ghanaian languages. The sample comparative dictionary entry drew some data from an earlier attempt which turned out to be in the ‘Ghana Doulos’ ASCII font and needed manual conversion to integrate with the entry from the final (hopefully) database which I am currently using [whoops! just discovered there is already a Unicode-converted version – such is life in this disorganised project!].
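The general shape of such a conversion, for anyone curious, is a character-for-character substitution followed by Unicode normalisation. The sketch below assumes a simple one-to-one mapping; the particular pairs are invented for illustration and are not the real ‘Ghana Doulos’ assignments, and the function name is hypothetical.

```python
import unicodedata

# Hypothetical mapping from legacy stand-in characters to Unicode.
# These pairs are invented for illustration only; they are NOT the
# actual 'Ghana Doulos' code-point assignments.
LEGACY_TO_UNICODE = {
    "@": "\u0254",  # open o
    "#": "\u025B",  # open e
    "&": "\u014B",  # eng
}


def convert_legacy(text, mapping=LEGACY_TO_UNICODE):
    """Replace legacy stand-in characters and normalise the result to NFC."""
    table = str.maketrans(mapping)
    return unicodedata.normalize("NFC", text.translate(table))


# An arbitrary test string, not a real word in any of the languages.
print(convert_legacy("k@&"))
```

Normalising to NFC means that a vowel typed as base letter plus combining tilde comes out as the same single character as one typed precomposed, which matters for sorting and searching. A real conversion would also need to leave untouched any text (English glosses, field markers) where the stand-in characters are meant literally.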

In this Brave New World the aim was to make lexical databases with the SIL ‘SHOEBOX’ application, which is widely used for this sort of work. The minimalist markup means that the data files can be read and edited in a plain text editor if need be. The programs designed to work with them allow the ordinary user to specify the display formatting, while geeks can readily make scripts to convert the basic form to other markup systems where required.
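As a rough illustration of the sort of script meant here, the following Python sketch reads backslash-marked records and re-emits them as JSON. It assumes one field per line and that each record starts with an \lx field; the markers \lx, \ps and \ge are used as they commonly appear in such files, but the sample records themselves are placeholders, and the function names are hypothetical.

```python
import json

# Placeholder records in backslash-marker form (not real dictionary data).
SAMPLE = r"""
\lx placeholder-one
\ps n
\ge sun

\lx placeholder-two
\ps v
\ge shine
"""


def parse_sfm(text, record_marker="lx"):
    """Split backslash-marked text into records of (marker, value) pairs."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # this sketch ignores blank and continuation lines
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = []
            records.append(current)
        if current is not None:
            current.append((marker, value.strip()))
    return records


def to_json(records):
    """Serialise records as JSON, keeping field order and repeated fields."""
    return json.dumps(records, ensure_ascii=False, indent=2)


records = parse_sfm(SAMPLE)
print(to_json(records))
```

Keeping each record as an ordered list of pairs, rather than a dictionary, preserves repeated fields (several examples or glosses in one entry) and the order in which they were typed.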

Such time as was available for the dictionaries work (alongside the translation and publishing of Mampruli Scripture: first Luke’s Gospel, then Genesis-Exodus 20, finally the New Testament in 2001) was mainly devoted to keyboarding and converting manuscript texts and wordlists to usable digital forms.

During the transition period from primary involvement with the Mampruli Project to working as a translation consultant for a number of the Ghana languages, a defining moment was August 2004. I was in Tamale, having moved out of Gbeduuri and not yet being able to move into our new residence near Techiman. At this time:
- I received the TOOLBOX program, the new incarnation of SHOEBOX.
- The final version of the Dagbani Dictionary (as a table in MS WORD) was passed on to me.
- I was urged to carry the Dagbani lexicographic project further.

By the time I had assimilated and converted to these new materials and goals, the Dagbani Bible and Fr. Kofi Ron Lange’s collection of Dagbani Proverbs were published (both 2006) and I had access to the electronic files of both, partly with a view to noting possible corrections for future editions.

Basically, from 2004 to 2014 I was engaged on the Dagbani:
- working through the Dagbani Bible and inserting examples from the text into the entries in the TOOLBOX dictionary database.
- adding thesaurus keys, lexical functions and other missing fields when an example was inserted in a record.
- refining the entry structures to make as easy as possible the entry and retrieval of the content that I felt was important. These principles were then intermittently used for other dictionary databases that came my way, most particularly Agole Kusaal, where I was acting as consultant to the Old Testament Translation team from 2001 to 2013.
- One aim which I have not yet fully implemented was to produce a tailored set of ‘lexical functions’ less complex than those of Igor Mel’čuk and his collaborators, but more suited to my needs than those provided by the Multi-Dictionary Formatter of David Coward and Chuck Grimes [son of the Joe E. with whom this saga started] which works with the TOOLBOX program.
- During the latter part of this period I worked through the whole Dagbani database in alphabetical order bringing all the entries into uniformity with the finally-adopted format.
With one or two side-excursions (producing a restricted-entry Mõõré dictionary in 2012, for instance) I completed this process by the end of 2014 and consider my involvement with the Dagbani effectively completed. The dictionary is published on the website.

*REMINDER :: A schematic diagram summarising this long, rambling post can be seen here.
