Custom typological maps with R

I used to plot my typological language data on geospatial maps with Generic Mapping Tools (GMT), which is awesome: a simple two-liner does the job. But I found it hard to use in teaching, since it doesn’t run smoothly on everyone’s operating system. So it’s time for me to move on and learn to make maps with R. There are a few awesome resources out there, including lingtypology, which reads data directly from Glottolog.

But I wanted to plot data that is not included in a database. This is actually easy, but so far not entirely trivial to find on the internet, so here’s a short tutorial. First of all, here is our little data set, with geospatial coordinates, language family and basic word order for each language. Save it to a text file named “typology.txt” in your working directory.

Language,Latitude,Longitude,Family,Family2,Word order
Movima,-13.81,-65.63,Isolate,9,0
Arapaho,43.39,-108.81,Algic,10,OV
Alta,15.69,121.45,Austronesian,12,OV
Savosavo,-9.13,159.81,Isolate,9,OV
Teop,-5.67,154.97,Austronesian,11,VO
Sumi,26,94.42,Sino-Tibetan,12,VO
Yali,-4.08,139.46,Nuclear-Trans-New-Guinea,13,OV
Beja,17.24,36.67,Afro-Asiatic,14,VO
Vera'a,-13.89,167.43,Austronesian,11,OV
Cabecar,9.67,-83.41,Chibchan,15,VO
Urum,42.04,43.99,Turkic,16,OV
Dolgan,71.11,94.29,Turkic,16,OV
Gorwaa,-4.24,35.8,Afro-Asiatic,14,OV
Pnar,24.82,92.26,Austroasiatic,17,0
Goemai,8.74,9.72,Afro-Asiatic,14,VO
English,53,-1,Indo-European,7,VO
Nung,-29.71,19.08,Tuu,18,VO
Bora,-2,-72.26,Boran,19,VSO
Nafsan,-17.7,168.38,Austronesian,11,OV
Komnzo,-8.65,141.52,Yam,20,OV
Kamas,55.07,94.83,Uralic,21,VO
Kakabe,10.6,-11.44,Mande,22,OV
Daakaka,-16.27,168.01,Austronesian,11,VO

Next, in your R script, load the following packages. You might have to install them first with install.packages():

library(ggplot2)
library(maps)
library(sf) #for advanced mapping options
library(viridisLite) #for pretty, color-blind friendly color palettes

Read in your map data and your language coordinates and save them to short variables like “map” and “df” (for data frame).

df <- read.csv("typology.txt")
head(df) #shows you the beginning of the data, good for troubleshooting
map <- map_data("world")
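
A quick check after reading the file can save some confusion later. The sketch below reads two rows of the data set from a string (via the text argument of read.csv(), so it runs without the file) and shows that R replaces the space in the header “Word order” with a dot, which is why the plotting code further down refers to Word.order. The variable name sample_df is only used for illustration.

```r
# Two rows of typology.txt, read from a string instead of a file
sample_df <- read.csv(text = "Language,Latitude,Longitude,Family,Family2,Word order
Arapaho,43.39,-108.81,Algic,10,OV
Teop,-5.67,154.97,Austronesian,11,VO")

names(sample_df) # the last column is now called Word.order
str(sample_df)   # Latitude and Longitude should show up as numeric

# Stop early if the coordinates were not read as numbers
stopifnot(is.numeric(sample_df$Latitude), is.numeric(sample_df$Longitude))
```

If the check fails on your own file, a stray character in one of the coordinate columns is a likely culprit.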

In order to plot your language coordinates with additional information about language families, do this:

#family
ggplot() +
  geom_polygon(data = map, aes(x = long, y = lat, group=group), fill="#dddddd") + #plots the world map in the background in light grey
  geom_point(data = df, aes(x = Longitude, y = Latitude, fill = factor(Family)), shape=21, size=3) + #plots the language coordinates
  theme_minimal()+ #fewer embellishments
  coord_sf()+ #nice proportions
  scale_fill_viridis_d(option = "plasma")+ #color scheme
  labs(fill = "Family", y="",x="") #labels

Plot of languages as they are geographically distributed across the world, with color indicating language family

To show word order patterns instead, use this:

#word order
ggplot() +
  geom_polygon(data = map, aes(x = long, y = lat, group = group), fill="#dddddd") +
  geom_point(data = df, aes(x = Longitude, y = Latitude, fill = factor(Word.order)), shape=21,  size=3) +
  theme_minimal()+
  coord_sf()+
  scale_fill_viridis_d(option = "plasma")+
  labs(fill = "Word Order", y="",x="")
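
The two plots differ only in the column used for the fill color and in the legend title, so they can be wrapped in one function. This is a minimal sketch, not the only way to do it: the function name plot_languages and its arguments are made up for this example, and it uses ggplot2's .data pronoun to look up a column by its name as a string. A few rows of the data set are read from a string so that the sketch runs without typology.txt.

```r
library(ggplot2)
library(maps)
library(sf)

# Hypothetical helper: draws the grey world map and colors the points
# by whichever column of df is named in `column`
plot_languages <- function(df, map, column, legend_title = column) {
  ggplot() +
    geom_polygon(data = map, aes(x = long, y = lat, group = group),
                 fill = "#dddddd") +
    geom_point(data = df,
               aes(x = Longitude, y = Latitude,
                   fill = factor(.data[[column]])),
               shape = 21, size = 3) +
    theme_minimal() +
    coord_sf() +
    scale_fill_viridis_d(option = "plasma") +
    labs(fill = legend_title, y = "", x = "")
}

# Three rows of the data set from above, read from a string
df <- read.csv(text = "Language,Latitude,Longitude,Family,Family2,Word order
Arapaho,43.39,-108.81,Algic,10,OV
Teop,-5.67,154.97,Austronesian,11,VO
Urum,42.04,43.99,Turkic,16,OV")
map <- map_data("world")

p_family <- plot_languages(df, map, "Family")
p_order  <- plot_languages(df, map, "Word.order", "Word Order")
```

Typing p_family or p_order at the console draws the respective map.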

And since newbies sometimes use my tutorials: if you don’t understand something here, don’t give up, google it! Everyone does it, and most questions you might have were already asked and answered by someone somewhere.

GMT: Changing the color of a symbol within a xy file

This is a very specific problem with Generic Mapping Tools that I didn’t find well documented: if you map symbols to a list of coordinates specified in COORDINATES.xy, there are two ways to give subsets of those coordinates symbols in different colors.


Pretty WALS maps

A pretty map with WALS data, generated by GMT

The World Atlas of Language Structures (WALS) maps data from typological studies to a world map. In addition to the online version, there is also a program for producing maps locally.

An even prettier map, in SVG format, with the background customised in Inkscape

However, the options for customisation are limited. I use the free and open command-line tool GMT for the production of linguistic maps. It has awesome tools for all kinds of tasks, including the mapping of symbols from a file of coordinates. Here is a quick guide on how to produce your own pretty WALS map.

  1. Download your data set from WALS as tab-separated values (there is a button just underneath the header). Save it as walsXY.xy, where XY is the WALS feature you want to map.
  2. Remove the metadata lines at the top of the file and the header of the table.
  3. GMT does not distinguish between tabs and plain spaces, so a space inside a field (for example in a language name) would be read as a column break. Remove all plain space characters or replace them with a character of your choice.
  4. Start GMT and move to the directory to which you have downloaded your data set and where you want to produce your map.
  5. In the same folder, create a cpt file containing the colors that you want to assign to the different values. My wals.cpt file has the following content (number of the WALS value, followed by its RGB values):
    1 240/11/0
    2 0/210/240
    3 240/180/0
    4 28/142/59
    5 28/54/142
    6 90/28/142
    7 211/211/170
    8 0/0/0
  6. Run the following commands in GMT:
    pscoast -R-180/180/-70/80 -JQ7i -K -Ssteelblue > walsXY.ps
    psxy walsXY.xy -R -i5,4,2 -J -O -Sc0.15c -Cwals.cpt >> walsXY.ps
    
  7. For more options, see the documentation of GMT.
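
Steps 2 and 3 can also be scripted instead of done by hand. Here is a minimal sketch in R, assuming that you check your download and pass the number of lines to drop at the top yourself, and that spaces should become underscores. The function name clean_wals_lines and the sample lines are made up for illustration and are not real WALS data. The sample is laid out so that columns 5, 4 and 2 (counting from 0) hold longitude, latitude and value, which is what the -i5,4,2 option in step 6 selects.

```r
# Hypothetical helper: drop the first n_skip lines (metadata and table header)
# and replace plain spaces, leaving the tabs between columns untouched
clean_wals_lines <- function(lines, n_skip, replacement = "_") {
  kept <- lines[seq_along(lines) > n_skip]
  gsub(" ", replacement, kept, fixed = TRUE)
}

# Made-up sample in the assumed layout: two metadata lines, one header line,
# then tab-separated data rows
raw <- c(
  "Some metadata line",
  "",
  "wals code\tname\tvalue\tdescription\tlatitude\tlongitude",
  "xxa\tLanguage A\t1\tFirst value\t10.5\t20.25",
  "xxb\tLanguage B\t2\tSecond value\t-3.0\t140.0"
)

cleaned <- clean_wals_lines(raw, n_skip = 3)
cleaned

# On a real download, read the file with readLines("walsXY.xy"), clean it
# with the right n_skip for your file, and write it back with writeLines()
```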