News

Compiling a list of glosses from your glossed examples in a LaTeX document (under UNIX)

If you have many interlinearized examples in your LaTeX documents, you have probably wondered about the best way to handle them. Here are some ideas. There are two potential problems with the glosses: 1) Different publishers may have different requirements for how to print them, so transferring glossed examples from one manuscript to another may be difficult. 2) You’ll want to have a list of all the glosses in your document, and it should be complete and consistent. To address both problems, the main strategy is to label all your glosses explicitly as such by using a new command, which we may call “Gloss”:

\newcommand{\Gloss}[1]{\textsc{#1}}

This command prints all your glosses in small caps. If a publisher requires all-caps instead, you can change this command to:

\newcommand{\Gloss}[1]{\MakeUppercase{#1}}

Then you need to use this command, of course, when interlinearising examples. This might look as follows:

\exg. \dots a mwe \textbf{yur}-yurmiline suku-on nyoo, suku-on ane gyes=an nyoo. \\
  and \Gloss{real} \Gloss{redup}-forget stuff.of-\Gloss{3sg}.\Gloss{poss} \Gloss{3pl} stuff.of-\Gloss{3sg}.\Gloss{poss} \Gloss{tr} work=\Gloss{nmlz} \Gloss{3pl}\\
  \enquote{and he repeatedly forgot his things, his tools for work.}

In order to use this command for the automatic compilation of your glosses, it is important not to include any separators in a gloss. As you can see above, separators such as “.-=” are not included within the curly braces of a Gloss argument, but stay outside.
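To find glosses that break this rule, a search along the following lines may help. This is only a sketch: the function name find_bad_glosses is made up for illustration, and the separator set simply mirrors the characters that the tr call further down in this post treats as boundaries (space, ".", ";", ":", "-", "=" and parentheses).

```shell
# Hypothetical helper: print numbered lines in which a separator
# appears inside the braces of \Gloss{...}.
# Reads the files given as arguments, or standard input if there are none.
find_bad_glosses() {
  grep -n 'Gloss{[^}]*[-.=;:() ]' "$@"
}

# Two test lines: the first is fine, the second has "." inside the braces.
printf '%s\n' \
  'stuff.of-\Gloss{3sg}.\Gloss{poss}' \
  'stuff.of-\Gloss{3sg.poss}' | find_bad_glosses
```

On a real manuscript this would be called as find_bad_glosses INFILE.tex; the line numbers in the output show where to fix the glosses.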

Now open your terminal and type in the following command:

grep 'Gloss' INFILE.tex | tr -s ' .;:\-=()\\' '\n' | grep 'Gloss' | sort -u

I usually use ack instead of grep, but here grep works just fine. Let’s break this down a little: the first “grep” command selects all lines that contain the sequence “Gloss” out of your INFILE.tex document (this step isn’t strictly necessary, since the second “grep” filters again). The “tr” command translates all occurrences of the characters in the first pair of quotation marks into the newline character “\n”, and the “-s” option squeezes runs of newlines into one. The next command takes the result of that action and again keeps only those lines that contain the word “Gloss”. And “sort -u” sorts uniquely, that is, it gives you an alphabetical list of your search with duplicates thrown out. The result of this action is displayed at the end of this post.
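To see the individual steps at work, here is a small sketch that runs the same pipeline on a shortened version of the gloss line from the example above. The wrapper function list_glosses is a name invented for this illustration; it reads from standard input instead of INFILE.tex.

```shell
# Hypothetical wrapper around the pipeline from this post; reads LaTeX on stdin.
list_glosses() {
  grep 'Gloss' |                      # keep lines that mention Gloss
    tr -s ' .;:\-=()\\' '\n' |        # split at separators, one token per line
    grep 'Gloss' |                    # keep only the gloss tokens
    sort -u                           # sort and drop duplicates
}

# Shortened gloss line from the example above.
sample='and \Gloss{real} \Gloss{redup}-forget stuff.of-\Gloss{3sg}.\Gloss{poss} \Gloss{3pl}\\'
printf '%s\n' "$sample" | list_glosses
```

Because the backslash is among the translated characters, the entries come out as Gloss{real}, Gloss{redup} and so on, without the leading backslash.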

You can take this list directly and put it into your LaTeX document. If you use the Sublime Text editor, you can use multiple cursors to add dashes and semicolons to all entries at the same time. Usually, there will be inconsistencies and typos in your glosses. If you get any unclosed braces as in “Gloss{xx”, that probably means you included a separator within the braces, and you’ll have to look for those cases. So first use the list to clean up your glosses, run the above command again after each round, and once your list looks perfect, include it in your LaTeX document.
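The search for broken entries can be partly automated. The following sketch assumes that a clean entry has exactly the form Gloss{...} and prints only those entries of the list that deviate from it; suspicious_glosses is a made-up name, and the pipeline inside it is the one from this post.

```shell
# Hypothetical helper: run the pipeline from this post on stdin and
# print only entries that are NOT a clean Gloss{...} token.
suspicious_glosses() {
  grep 'Gloss' | tr -s ' .;:\-=()\\' '\n' | grep 'Gloss' | sort -u |
    grep -v -x 'Gloss{[^{}]*}'
}

# Test input: a separator inside the braces and a stray letter after them.
printf '%s\n' \
  '\Gloss{3sg.poss} \Gloss{irr}' \
  'the \Gloss{np}s' | suspicious_glosses
```

An empty result means every entry has the expected shape; typos and inconsistent abbreviations (such as two spellings of the same gloss) still have to be spotted by eye.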

Gloss
Gloss{1du}
Gloss{1excl}
Gloss{1incl}
Gloss{1pl}
Gloss{1sg}
Gloss{1s}
Gloss{2sg}
Gloss{2s}
Gloss{2}
Gloss{3du}
Gloss{3pc}
Gloss{3pl}
Gloss{3sg
Gloss{3sg}
Gloss{3}
Gloss{adv}
Gloss{ad}
Gloss{agr}
Gloss{ana}
Gloss{art}
Gloss{asr}
Gloss{aux}
Gloss{bi}
Gloss{body
Gloss{caus}
Gloss{clf}
Gloss{comp}
Gloss{cond}
Gloss{conj}
Gloss{cons}
Gloss{cont}
Gloss{cop}
Gloss{def}
Gloss{dem}
Gloss{detr}
Gloss{det}
Gloss{disc}
Gloss{dist}
Gloss{dl}
Gloss{dst}
Gloss{es}
Gloss{excl}
Gloss{freq}
Gloss{fut}
Gloss{hab}
Gloss{hesit}
Gloss{impf}
Gloss{incl}
Gloss{incpt}
Gloss{irr
Gloss{irr}
Gloss{it}
Gloss{it}/
Gloss{loc}
Gloss{med}
Gloss{name}
Gloss{nec}
Gloss{neg2}
Gloss{neg}
Gloss{nmlz}
Gloss{np}s
Gloss{num}
Gloss{obj}
Gloss{part}
Gloss{pft}
Gloss{pl}
Gloss{poss1}
Gloss{poss2}
Gloss{poss}
Gloss{pos}
Gloss{pot}
Gloss{pp}
Gloss{prep}
Gloss{prf}
Gloss{prog}
Gloss{prox}
Gloss{prsup}
Gloss{real}
Gloss{recp}
Gloss{redup}
Gloss{res}
Gloss{sbj}

Three deaths and one marriage…

…made for a challenging field trip to Ambrym this year. Still, I was lucky to find enough speakers of both Daakaka and Dalkalaen to collect the data we need for MelaTAMP – thanks to the people of Emyotungan and Tio Bang in Port Vila. Now I’m excited to start with the analysis. Stay tuned.

MelaTAMP Workshop in Port Vila

We held a small workshop at the University of the South Pacific yesterday, on Emalus Campus in Port Vila, Vanuatu. The purpose was to introduce and discuss the storyboards we have developed as part of the MelaTAMP project.

We thank Robert Early and Meriani Situ for their organisation and local support, and we’re happy that our audience extended far beyond our few project members and collaborators.

Storyboards for TAM expressions

Scene from a storyboard

For our project, we primarily work with corpus data, but we also have funding to do further field work and elicit contexts that are rare or unattested in the corpora. As our primary method of elicitation, we have decided to use storyboards, which are short scripts accompanied by pictures.

I have created the pictures for our stories in Inkscape. The stories and SVG source files are being made available on our project wiki. The SVG files can easily be customised; just credit the project and me with the original creation.

Pretty WALS maps

A pretty map with WALS data, generated by GMT

The World Atlas of Language Structures maps data from typological studies to a world map. In addition to the online version, there is also a program for the local production of maps.

An even prettier map, in SVG format, with the background customised in Inkscape

However, the options for customisation are limited. I use the free and open command-line tool GMT for the production of linguistic maps. It has awesome tools for all kinds of tasks, including the mapping of symbols from a file of coordinates. Here is a quick guide on how to produce your own pretty WALS map.

  1. Download your data set from WALS in tab-separated values (there is a button just underneath the header). Save it as walsXY.xy, where XY is the WALS feature you want to map.
  2. Remove the metadata lines at the top of the file and the header of the table.
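  3. GMT does not distinguish between tabs and other blank characters, so a space inside a field (for example in a language name) would be read as a column break. Replace all plain space characters with nothing or with a character of your choice.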
  4. Start GMT and move to the directory to which you have downloaded your data set and where you want to produce your map.
  5. In the same folder, create a cpt file containing the colors that you want to assign to different values. My wals.cpt file has the following content (number of the WALS value, then RGB values):
    1 240/11/0
    2 0/210/240
    3 240/180/0
    4 28/142/59
    5 28/54/142
    6 90/28/142
    7 211/211/170
    8 0/0/0
  6. Run the following commands in GMT:
    pscoast -R-180/180/-70/80 -JQ7i -K -Ssteelblue > walsXY.ps
    psxy walsXY.xy -R -i5,4,2 -J -O -Sc0.15c -Cwals.cpt >> walsXY.ps
    
  7. For more options, see the documentation of GMT.
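
Steps 2 and 3 can also be done on the command line. The following is only a sketch, under an assumption derived from the -i5,4,2 option above: that latitude and longitude are the fifth and sixth tab-separated columns, so that any line without plain numbers in those columns is metadata or the table header. The function name clean_wals and the sample rows are invented for illustration; check the result against your actual download.

```shell
# Hypothetical helper: keep only data rows of a WALS tab-separated export
# and replace plain spaces with underscores. Reads stdin, writes stdout.
# Assumption: columns 5 and 6 (counting from 1) hold latitude and longitude.
clean_wals() {
  awk -F'\t' '
    $5 ~ /^-?[0-9]+(\.[0-9]+)?$/ && $6 ~ /^-?[0-9]+(\.[0-9]+)?$/ {
      gsub(/ /, "_")   # spaces would otherwise be read as column breaks
      print
    }'
}

# Invented sample: one metadata line, the table header, one data row.
printf 'some metadata line\n'
printf 'wals code\tname\tvalue\tdescription\tlatitude\tlongitude\n'
printf 'abc\tSome Language\t2\tSome value\t-16.5\t168.0\n' | clean_wals
```

With a real download, the call would look like clean_wals < download.tsv > walsXY.xy, where download.tsv is a placeholder for the file saved from WALS.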

The future is what the universe wants

There was a remarkable small workshop on imperatives at ZAS last week that I was happy to be part of. It was a welcome opportunity to take up my work on potential mood directives in Daakaka and their relations to future assertions and embedded clauses. You can see my slides here.

Towards an ontology of modal flavours

modal flavours wordle

I’m still feeling warm and fuzzy from the wonderful workshop we had last week at the DGfS conference about modal flavours. The idea for this workshop had formed last year during the SIAS summer institute on the investigation of linguistic meaning, together with Ryan Bochnak and Anne Mucha. We were very happy to win Aynat Rubinstein as our invited speaker and get some excellent submissions from various subfields of linguistics. We gave an overview of background motivations and common themes of the talks in our introduction.

Out now: Indefinites in Daakaka (Vanuatu)

There are two indefinite articles in the Oceanic language Daakaka, TUSWA and SWA. Like weak NPIs or unspecific indefinites in many other languages, TUSWA is excluded from positive assertions about the episodic past or present. In this paper, I try to locate them within the cross-linguistic space of indefinites and NPIs and sketch out an approach to account for their differences.

Read the full paper here.