The Vulture Papers

Location: Bellevue, Washington, United States

Nathan is both a writer and designer of books and eBooks and is part-owner of boutique publisher Long Tale Press, LLC. He is available to help make your eBook or Book publishing project come alive with great book design.

Tuesday, August 30, 2005

Indexing and Search

It's been a while since I posted on this site. I've several others at the office that I use, so this one has been rather neglected of late. But lately I've been puzzling out some of the differences between an index and search, and specifically wondering how the user experience of a document can be improved through one or the other.

I'm an avid user of indexes in printed material, especially books. I've gone so far as to wonder why the Harry Potter books aren't indexed so that I can go find a reference from one book in another without re-reading the entire 660 pages.

Ah... now there's a clue. I want to find something, so I look in the index. In other words, when I'm dealing with paper documents, the index is my search tool. Now, if I could simply say to the book, "time to back out is now" and have the correct volume fall open to page 174, would I ever use the index?

I have to say that I would, but not for the same things. Instead of being a search tool, the index would focus on the other aspect of its being which is education. A well-formed index includes things like topics, sub-topics, see, and see also. By looking in an index I can find out what other things are related to the topic that I have thought of. In all likelihood, they will be things that I hadn't considered in my original search.

So, when we get to electronic documents where I have a good search tool, I believe that I still need a good index. The operative word in both topics is "good." By and large, neither search nor indexes in electronic documents are good. And they do not get better by giving me more references. When I search for "Vacation in Europe with a pre-teen," I don't want porn sites as part of my search results. I'd like fewer results that mean more. I'd like search results that educate me on other searches that I might conduct. In fact, I want a good index to this subject.

On the other hand, if I look at an electronic index for an electronic document, I don't need an alphabetical listing of every word that appears in the document. What I need is a grouping of subjects, cross-referenced with other subjects, that will guide me to correct phrasing of my question for optimum search results. In fact, I want a good search engine.

So, do search and index have the same destiny when we are dealing with electronic documents? Or will one die as paper books and their navigation recede into the past for many of our students? I believe there must be a merging of the concepts that allows rapid location of specific information, educates the reader about related topics, and limits the search results to relevant information. And I want it done automatically. Is that asking too much?

Wednesday, November 17, 2004

The best page size

Well, I’ve delayed long enough on this one. It starts out “If 8.5" x 11" paper weren’t the best paper size for reading, it wouldn’t be the standard size for office documents.” I wish it were so obvious. In fact, if you are in Europe, you might be saying the same thing for A4 size paper (210mm x 297mm). So this is my opportunity to talk a little history about paper.

First off, understand that printing (or even writing) on pages is a relatively new technology. For thousands of years people rolled their words up in scrolls, for which the sheet was simply as big as it needed to be. It’s not unlike the bottomless pages of the internet that scroll down until you run out of content.

Some of the first books bound together out of separate leaves of paper appeared in the 13th and 14th centuries. These were so significant in their difference from scrolls that they often bore the name “codex” meaning a paginated document. When the printing press came along in Europe in the middle of the 15th century, it was much easier to print on separate sheets than on long scrolls, so paginated content became the norm. To do massive runs of printed material it was necessary to have a standardized page size in an era where everything was being made by hand.

But what size?

Enter animal nature. Paper-making was still in its infancy in Europe, though it had been around for hundreds of years in Asia. Far more reliable were animal skins, specifically calfskin which was treated, scraped thin or even split, and dried to form a suitable writing or printing surface. The standardized paper size seems to be derived from the largest dependable size rectangle that could be cut out of the hide of a calf. This rectangle is approximately 24" x 34", from which two 24" x 17" sheets of printing stock, or “folios” could be cut. After gathering the folios into quires, binding them together and trimming them, the final trim size for a large book was about 11" x 17" or twice a letter size sheet of paper. It was far too costly to do extensive print runs on vellum, however, and the art of making paper was quickly advanced as the medium for printing. Paper, which could have been moulded into any desired size, seems to have evolved to reflect the dimensions of the calfskin.

So, 8.5" x 11" paper is not that size because it is desirable, but because it is an artifact of how fat a calf of butcherable size is.

Enter the philosophy of the Golden Rectangle. Many philosophers held that there was a divine proportion that was reflected in works that were most pleasing to the eye. Exactly how to compute this perfect mean, however, was a matter of dispute that has never been precisely resolved. Based on the work of da Vinci and Fibonacci, who looked at the proportions of Greek temples and the like, the mystic proportion was deemed to be approximately 1:1.618. This can be constructed by creating a number string beginning with 0,1 and having each number continuing the string be the sum of the previous two. So the string goes: 0,1,1,2,3,5,8,13,21,34,55,89.... As you compare each consecutive pair of numbers you discover the ratio getting ever closer to 1.618 without actually reaching it. One pair will be slightly below and the next slightly above. These philosophers held that this string of numbers revealed the divine proportion and that a perfectly sized sheet of paper would have this proportion. Indeed, some newspapers today still have the proportion in broadsheet of 1:1.618. (Letter size, by the way, happens to be 1:1.294, quite a way from the divine.)
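For anyone who wants to watch the number string close in on the proportion, here is a minimal sketch in Python. The function name `fibonacci_ratios` and the choice of ten pairs are my own for illustration; the only thing taken from the discussion above is the rule that each number is the sum of the previous two.

```python
def fibonacci_ratios(count):
    """Return the ratios of consecutive numbers in the string, starting with 2/1."""
    ratios = []
    a, b = 1, 2  # the first pair that gives a ratio other than 1
    for _ in range(count):
        ratios.append(b / a)
        a, b = b, a + b  # each new number is the sum of the previous two
    return ratios


# The value the ratios approach, (1 + sqrt(5)) / 2, roughly 1.618
GOLDEN = (1 + 5 ** 0.5) / 2

for ratio in fibonacci_ratios(10):
    side = "above" if ratio > GOLDEN else "below"
    print(f"{ratio:.6f} is {side} {GOLDEN:.6f}")
```

Running it shows the ratios landing alternately above and below the target, with the gap shrinking at every step.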

The Bauhaus design movement of the early 20th century, inspired in part by the work of Swiss architect Le Corbusier, charted a slightly different relationship: that of the side of a square to its diagonal. This is approximately 1:1.414. This relationship has the great advantage of being proportionally stable when folded in half. Taking a sheet of paper with a surface area of 1 square meter in dimensions proportionate to 1:1.414, you can fold it in half repeatedly until it reaches a size of 210mm x 297mm and it will still have a width to height ratio of 1:1.414. Hence we get the A series of papers and A4, now commonly in use in most of the metric world.
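The folding arithmetic is easy to check. The sketch below is only an illustration: it assumes the starting sheet is exactly 1 square meter at a ratio of 1 to the square root of 2, and the function name `a_series` is made up for the example. The results are unrounded, so they come out a fraction of a millimeter away from the published whole-millimeter sizes.

```python
import math


def a_series(n):
    """Return (width_mm, height_mm) after halving the 1 m^2 starting sheet n times."""
    # Starting sheet: width * height = 1 m^2 and height = width * sqrt(2)
    width = 1000 / math.sqrt(math.sqrt(2))   # millimeters
    height = 1000 * math.sqrt(math.sqrt(2))  # millimeters
    for _ in range(n):
        # Folding in half: the old width becomes the new height,
        # and half of the old height becomes the new width.
        width, height = height / 2, width
    return width, height


for n in range(5):
    w, h = a_series(n)
    print(f"A{n}: {w:.1f}mm x {h:.1f}mm, ratio 1:{h / w:.3f}")
```

Four folds take the starting sheet down to roughly 210mm x 297mm, and the ratio printed on every line is the same.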

Many other sizes have been promoted as being the ideal reading size with more or less success. My personal philosophy is that the perfect reading size for the page is proportionate to the amount of type that you are putting on it because, frankly, people don’t read the page, they read the type. We can determine an appropriate width of a line of type as discussed in the post on Optimum Line Length below. Determining the right height of a column of text or a page of text is something that we will have to leave up to individual aesthetics and physical circumstances.

Saturday, November 06, 2004

Resolution is everything/nothing

Back in the 80s when desktop publishing was new, designers, typographers, and printers all came to me with the same complaint regarding the quality of laser type and even of the first Linotronic® set PostScript® type. The word that was used was “nervous” type. It had jagged edges. It lacked the perfectly smooth curves of phototypeset copy. It would never catch on.

In all fairness, while those of us who had adopted the technology early were oohing and aahing over 300 pixel per inch laser print compared to 72 ppi ImageWriters®, the vast consuming public (meaning the design and production component) were bemoaning the destruction of good typography and touting how laser print diminished readability. The early Linotronic imagesetters were a mere 1270 ppi and were seen only as an incremental improvement over the LaserWriter®, even though the award-winning color coffee-table book WhaleSong was set on that very device. It wasn’t until the next generation of imagesetters that bumped things up to 2540 ppi or even 3600 ppi on Agfa’s SelectSet® that designers and publishers exhaled and accepted the electronic type as “almost” as good as photo-typeset.

And so it is today that with every monitor I get that bumps the resolution up another dot, I breathe a sigh of relief that we are becoming more readable. 120 ppi, ClearType® technology, 80 hertz scan rate, and I’m nearly ready to concede that I’m comfortable reading on screen. I can’t wait for a 150 ppi monitor, and I’ve seen a couple $10,000 monitors that run over 200 ppi. That will be the day.

I held that opinion until I did a seminar on readability and a 20-year-old stood up with a comment. “I don’t get it,” he said. “I’ve always read on a monitor and the type is just fine. What we don’t have is content.”

That set me back as I realized that this guy was born the year I got my first Macintosh® with an ImageWriter. He has lived in a world where his first reading experiences were synonymous with pixelated type both on-screen and in print. The vast majority of commercial publications are set on “desktop” publishing systems and reading on-line is as much a part of this generation’s day as a morning cup of coffee is to mine. And I suspect that the characteristics of readable type for this generation will begin to include pixelation artifacts, just as the chisel marks of stone carving that were integrated into the design of lead type as serifs have become an issue of readability in Western culture.

I’m still looking forward to a higher resolution monitor with better typography and clearer type, but I’m now ready to consider if not concede that viewing it as a barrier to readability may well be a generational thing, not an absolute.

All registered product names are owned by their manufacturers and are used here only as reference. Owners include Linotype Corporation, Agfa Inc., Apple Computer, and Microsoft Corporation.

Sunday, October 31, 2004

Bigger type please!

Let’s take a few minutes to look at type on a colored background. Does it need to be bigger?

A few years ago I took a number of typographic “schemes” to the president of our company for approval. He looked at some and said “Okay,” and at others and said “This type needs to be bigger.” After I got back to my desk to make the modifications I realized that he had both okayed and criticized the same scheme on numerous instances, but that the difference was in how the type was displayed against a colored background. So I went to work trying to figure out what was there beyond his obvious “make it bigger” request.

What I discovered was that it is not so much the size of the type as the contrast between the letters and the background that makes certain type hard to read. As you saw in the section on emphasis below, there are circumstances in which you simply need more contrast to make something more readable. Contrast can be achieved in several ways. The first is by making the difference between the value of the type and the value of the background as large as possible. This is a color issue. But sometimes you want to use colors that simply don't have that much range between their values. So what do you do then?

There are other kinds of contrast than just the contrast of value. One, true to my president’s desire, is contrast of size. Making the type bigger may well make it easier to read, though it is no guarantee. Contrast of texture is another way to achieve better readability. Instead of making the type bigger, make it heavier. You can vary weights of type in HTML by fairly small amounts rather than requiring actual bold, demi, or heavy typefaces. With due respect to typographic purists who want to honor the typographer’s intent, when you put the type on a colored background you are already altering that intent, so feel free to make minor alterations in weight for the surface you are working on. Contrast of density may be another way of making the type more readable. That is simply spreading it out a little more, both vertically and horizontally.

Surprisingly, all these methods actually work to improve the readability of text on a colored background, proving once again that there is no single answer to any of our reading problems, but a variety of options for solving them. My opinion is to try them all and see which one fits your use best. It may be simply making it bigger.

Tuesday, October 26, 2004

Is this important?

Or is this?

One of the older myths of page design held that if it was important it should be printed in red. And in some cases, red type does, indeed, draw attention. But in terms of sheer page dominance, red doesn’t hold a candle to strong pure black.

The value of a color often is less significant in cueing your audience to the importance of an issue than the background, position, and graphic treatment of the text. Let’s try that again:

Now this is important!

The first thing you notice is that if you want to use a color to emphasize a point, it should be the dominant thing that people see. The background behind this block of text is far more telling than the text itself. The fact that it creates a significant block across the page rather than just being behind the text portion is another thing that draws your eye to the color. Finally, the level of contrast between the text and the background is high enough that both read well. You'll notice that the red of the background is much more orangish than the red of the first line of type in this post. I’ll talk more about contrast in the next post.

So, if you want to use color to show something is very important, it helps to use the color to create a background or a solid block with type in it rather than to color the type on the same background as everything else. Let’s take one more look at this form to see if it works in reverse.

This is really important.

It makes a big statement whenever you put a black band across your page, but note that in order to make the type stand out against this, it is even yellower and not nearly as red. The reason? There simply isn’t enough value contrast between black and red to keep the type from disappearing into the background.
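One way to put a number on value contrast is the contrast ratio defined in the W3C Web Content Accessibility Guidelines (WCAG 2.x). To be clear, that is a measure swapped in here for illustration, not the method I used when choosing the colors in this post, and the RGB values below are stand-ins rather than the actual colors on the page. A minimal sketch in Python:

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0-255, per WCAG 2.x."""
    def linearize(channel):
        c = channel / 255
        # Undo the sRGB gamma curve (constants as given in WCAG 2.x)
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# Stand-in colors for illustration only
BLACK = (0, 0, 0)
PURE_RED = (255, 0, 0)
YELLOW = (255, 255, 0)

print(f"red on black:    {contrast_ratio(PURE_RED, BLACK):.2f}")
print(f"yellow on black: {contrast_ratio(YELLOW, BLACK):.2f}")
```

By this measure, yellow type on a black band has several times the contrast of pure red type on the same band, which fits what the eye reports.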

The rules for emphasizing text:
  1. Contrast is more important than color

  2. Large blocks of color draw the eye more rapidly than colored type

  3. Colored text against a black background needs to be even lighter than the colored background you would use behind black text

It’s as easy as 1-2-3.

Thursday, October 21, 2004


Why is it that people find pages easier to read than scrolls?

Well, if we were dealing exclusively with physical pages and scrolls, the answer would be fairly obvious. Sheer logistics in finding a place in a scroll as opposed to flipping through a few pages would make the preference for pages obvious. But in the electronic world is there any benefit to having pages instead of scrolling screens of content?

The answer to this is still both yes and no. If we compare paginated content in the form of fixed pages that relate to something other than the size of the screen or its viewing area to a bottomless scrolling page, we have to say that there is not a lot of benefit to pagination. The page is seldom the right size for the viewing environment and often requires oddball scrolling before pages even come into play.

However, if we compare the bottomless scrolling content to a page that is properly formatted and laid out for the viewing area on-screen, some advantages can be seen. First and foremost is the problem of the reading line-length of the content. Many scrolling pages have content that is so wide that it is difficult to track from the end of one line to the beginning of the next. Secondly, as you scroll, the eye fights to maintain its place in the content. If you use single line scrolling, you are constantly tapping the down arrow, trying to stay in a rhythm with your reading speed. If you use page-at-a-time scrolling, the eye often loses its place at the top or bottom of the screen before settling on a start point for the new reading experience. Both of these problems disappear with paginated content. If it is properly formatted for the screen size, then the line-lengths would be correct, showing multiple columns or appropriate margins to surround the type. The eye automatically snaps to the top left corner of the page (in western culture) and begins reading with the first character every time, reducing the amount of time spent looking for the right place to continue reading.

This does not necessarily improve the locating of content. In fact, in some instances it may be harder to find a specific location in paginated content than in scrolling content. Most paginated reading experiences, for example, make it difficult to tell how far into the content you have progressed compared to the elevator box on a scrollbar. Even if you have page numbers, are you on page 100 of 500 or page 100 of 101?

Many people have also suggested that pagination that differs from viewing surface to viewing surface (including the printed book) is highly confusing and makes it difficult to find a particular location if previously viewed in a different layout, or to match locations in a class where there are multiple layouts. I want to suggest that this problem is not better in a scrolling environment and that in either case we need better visual cues and navigation systems in order to make on-screen reading a palatable experience.

Does paginated content offer a better reading experience than bottomless scrolling content? I would have to say that in most of the current manifestations it is a toss-up. But the potential exists to make a phenomenal reading experience from paginated content that is combined with a great user interface and navigation system.

Friday, October 15, 2004

Optimum Line Length

If you go entirely by empirical research, you will discover that from the first books printed in English to today’s paperback novels, there are about 65 characters per line in material that is meant to be read immersively. Several studies have looked at the optimum reading line length in English and agree that for immersive reading experiences, around 65 characters per line is best. Even design books will tell designers to use the two-and-a-half alphabet line length as a guideline for best readability. But why is it?

An examination of the mechanics of the eye shows that there is a parafoveal angle of the eye that is approximately 12°. Within that, there is a foveal angle that is about three-quarters of a degree. The foveal angle is the amount of input into the eye that is sharply in focus. The parafoveal is the amount of information that is registered, but largely ignored by the eye as being slightly out of the direct line of sight. Beyond this 12° range we enter peripheral vision. The interesting thing is that this marks out the range of eye movement that can be done without engaging the neck muscles to turn the head as one reads. So, one might say that there are 16 chunks of sharp vision in the 12° range as the eye scans a line of type before returning to the beginning of the line.

But how much information can be acquired in each resting point? Some studies have shown that we can sharply see and register about 4–5 characters within the foveal fixation (coincidentally, the average length of an English word). What is not readily available in that information is whether the character recognition was based on a common book line of type or if it was scientifically arrived at by quantifying how much information the brain is able to comprehend from one foveal fixation point. It seems curious that the line length conveniently works out to 4 times 16 or 64 characters as the optimum.
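The arithmetic behind that curiosity can be laid out in a few lines of Python. This is just a back-of-the-envelope sketch using the figures quoted above (12°, three-quarters of a degree, 4–5 characters, two and a half alphabets of 26 letters); the names are invented for the example and nothing here is a model of how the eye actually works.

```python
PARAFOVEAL_DEGREES = 12.0   # range the eye covers without turning the head
FOVEAL_DEGREES = 0.75       # width of the sharply focused area
ALPHABET_LENGTH = 26        # letters in the English alphabet


def fixations_per_line():
    """Number of foveal-width chunks that fit in the parafoveal range."""
    return PARAFOVEAL_DEGREES / FOVEAL_DEGREES


def characters_per_line(chars_per_fixation):
    """Line length implied by a given number of characters seen per fixation."""
    return fixations_per_line() * chars_per_fixation


print("fixations per line:", fixations_per_line())
for chars in (4, 5):
    print(f"{chars} characters per fixation:", characters_per_line(chars))
print("two and a half alphabets:", 2.5 * ALPHABET_LENGTH)
```

Four characters per fixation gives the 64 mentioned above, a whisker under the designers' two-and-a-half-alphabet rule of 65, while five characters per fixation would stretch the line to 80.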

Without getting into a criticism of what we base our opinions on, I think that there is an open question that arises out of the investigation that no one has broached. Is it the number of characters within the foveal range or the amount of information that is contained therein? If it were the number of characters, we should expect that there should be four or five Chinese characters in a fixation of the eye. But each Chinese character is an entire word. So where we have five characters to make up one word, the Chinese would find five words. Can the eye grasp that amount of information in a single fixation? Or does it imply that Chinese should be written with only 32–33 characters per line instead of 64–66?

I raise these as questions, acknowledging that a lot of work has been done on this subject, but not yet having found the exact universal algorithm that would translate typesize intuitively across cultures and script types. I’m not yet ready to propose an answer, even though I’ve proposed a couple patents on technology that would offer one solution. For here, it is enough to question.