Using Compound Words on the Web
 Within every industry compound words are created, then used extensively, often without a second thought once they enter that industry’s mainstream lexicon. The same is true of acronyms, abbreviations, and buzzwords. For instance, look at the web development industry. We use all sorts of verbal shortcuts to convey information that our core readers will have no problems with. Eventually many enter the public mainstream and end up in the dictionary.
The process usually goes something like this:
- Two words are used jointly so often they become a common occurrence.
- Eventually they are joined by hyphenation to ensure they are spoken properly.
- If grammatically correct, the hyphen is eventually removed and a compound word is formed.
- The compound word is added to the dictionary and becomes part of common language.
Case in point: Multilevel. Long ago this probably wasn’t a word. But somewhere along the way, when man moved out of the cave and into the apartment, someone started describing a construct consisting of more than one level as having multiple levels. Among those who used the term often, it was at some point shortened by hyphenation to multi-level. Eventually this was widely adopted and compounded to give us the word we use today without giving it a second thought.
But before compound words like this become widely accepted and known to the general population, these shortcuts can create a small accessibility barrier (a readability hiccup) on the web: while they are recognized visually, they may be aurally bastardized. I will offer one example: Site Map.
The Problem
When it is written as two words, readers will see it as “Site Map,” and to most readers (and listeners) it’ll be apparent what it is and what it means. Within the web development industry, though, we will often create a new word out of those two by compounding them: Sitemap. Visually we still pretty much know what that new word means, but aurally it can be mispronounced as “sit-a-map,” and it is, though perhaps not by all screen readers. I’ve heard this myself using the Opera Voice plugin. I predict this will be corrected over time: first when dictionary publishers catch up with the new word’s adoption, then when screen reader software (soft ware) manufacturers catch up with the dictionaries. But until that time, care must be taken by the web developer who wants to ensure his or her content is not only accessible, but readable and understandable as well.
For some homegrown words this isn’t a problem: they will be pronounced correctly no matter how they are written, simply because of the limited ways they can be enunciated. Others, though, are readily subject to mispronunciation, to the point that the listener may wonder what the words mean. Care must be taken.
The Fix
This is really quite simple. Don’t compound words on the web before their time. That’s pretty straightforward and easy to implement, and it’s safe. But there are other ways to deal with it. I suppose one could identify the usage with the defining instance element (<dfn title="Site Map">Sitemap</dfn>), but this is impractical. Another workaround is to hyphenate, of course. And yet another method, one I learned from a Johan DeSilva article entitled “Building better websites by understanding blind users browsing behaviour,” is to use camel case, like so: SiteMap. According to the article, screen readers will speak that as if it were two words: Site Map.
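For anyone who would rather automate this than police it by hand, here is a minimal sketch of how the three written forms (two words, hyphenated, camel case) could be produced from a list of known troublemakers. Everything in it is an assumption made for illustration: the `PREMATURE_COMPOUNDS` word list, the `rewrite_compounds` function, and its `style` parameter are hypothetical names, not part of any screen reader or publishing tool. It only changes how the text is written; how a given screen reader actually pronounces the result still has to be checked by ear.

```python
import re

# Hypothetical word list: compounds that may be spoken badly, mapped to
# their constituent words. Extend it to suit your own content.
PREMATURE_COMPOUNDS = {
    "sitemap": ("site", "map"),
    "website": ("web", "site"),
}

# Separator used between the parts for each supported style.
JOINERS = {"spaced": " ", "hyphenated": "-", "camel": ""}


def rewrite_compounds(text, style="spaced"):
    """Rewrite listed compounds as 'spaced', 'hyphenated', or 'camel'."""
    if style not in JOINERS:
        raise ValueError(f"unknown style: {style!r}")

    def replace(match):
        word = match.group(0)
        parts = PREMATURE_COMPOUNDS[word.lower()]
        # Camel case capitalizes every part; the other styles only do so
        # when the original word started with a capital letter.
        if style == "camel" or word[0].isupper():
            parts = [part.capitalize() for part in parts]
        return JOINERS[style].join(parts)

    # Whole words only, so "sitemaps" or "websites" are left untouched.
    pattern = r"\b(?:" + "|".join(map(re.escape, PREMATURE_COMPOUNDS)) + r")\b"
    return re.sub(pattern, replace, text, flags=re.IGNORECASE)


print(rewrite_compounds("See the Sitemap."))
print(rewrite_compounds("Visit our website.", style="hyphenated"))
print(rewrite_compounds("Link to the sitemap.", style="camel"))
```

The default style is the two-word form because that is the safe option described above; hyphenation and camel case are there only as the alternative workarounds.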
I certainly have much to learn about screen readers and how disabled users interact with them and what their capabilities are, but this is food for thought.
Tommy Olsson responds:
Posted: March 15th, 2007 at 2:25 am →
Speech synthesizers need to improve, to learn to recognise compound words wherever possible. This may not be a huge problem with English, but in languages like Swedish, German, Dutch and Finnish compound words are very common and can become ridiculously long. One example in Swedish is a word that I encounter every now and then at the office: arbetsmiljökonsekvensbeskrivning (compounded from no fewer than four words: ‘arbete’ [work], ‘miljö’ [environment], ‘konsekvens’ [consequence], and ‘beskrivning’ [description]).
Not to mention real sesquipedalians like kontraktsanställningsålderstilläggsklassföljsamhet… 
Mike Cherim responds:
Posted: March 21st, 2007 at 3:16 pm →
That’s a mouthful, Tommy. Or an ear-full I guess if rendered aurally. I do have to wonder in what context “workenvironmentconsequencedescription” would be used.
Tommy Olsson responds:
Posted: March 23rd, 2007 at 6:28 am →
It’s something we’re required to do whenever we make any sort of changes that are likely to affect the work environment, either physically or psychologically.
Gill responds:
Posted: March 24th, 2007 at 1:33 pm →
Website is another one. Opera pronounces it Webseat. You just don’t think about it until you hear it.
I have to say though that arbetsmiljökonsekvensbeskrivning is not a word I’ve had a problem with.