Content determination is the process by which a person, group or program decides what information to include in, or exclude from, a document or text. It is closely related to document structuring, and it is studied within natural language generation and computational linguistics. Each of these fields uses content determination to examine how information is chosen.
Before deciding what to put in a document or text, the person compiling it will have conducted research or will have been given all the available data. Content determination covers the ways this information is whittled down for the final document. This is done by establishing the angle or objective of the text and then judging which of the available information is relevant to it.
The second consideration in content determination is style. This tends to depend on the nature of the intended audience. The audience’s intellect and familiarity with the subject matter will alter the lexical density and complexity of the information being imparted. Academics will tend to produce denser texts than gossip magazines, for example. Another consideration is the size of the format: whether the text will be a book, an article or a text message.
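Lexical density is commonly measured as the share of content words (nouns, verbs, adjectives and so on) among all the words in a text. The following is a minimal sketch of that ratio in Python. The function name `lexical_density` and the short `FUNCTION_WORDS` list are assumptions made for illustration; a serious measurement would use a part-of-speech tagger rather than a hand-written word list.

```python
import re

# Hypothetical, deliberately small list of English function words.
# A real analysis would identify word classes with a part-of-speech tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at",
    "to", "for", "with", "by", "from", "as", "is", "are", "was", "were",
    "be", "been", "it", "its", "this", "that", "these", "those", "he",
    "she", "they", "we", "you", "i", "not", "will", "than",
}


def lexical_density(text: str) -> float:
    """Return the fraction of words in `text` that are not function words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    content_words = [w for w in words if w not in FUNCTION_WORDS]
    return len(content_words) / len(words)


# Two made-up sentences: one written densely, one written lightly.
dense = "Stochastic gradient descent minimises empirical risk."
light = "It is the one that they said was in the news."

print(lexical_density(dense))
print(lexical_density(light))
```

Under these assumptions the first sentence scores higher than the second, which matches the intuition that an academic text packs more content words into each sentence than casual writing does.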
Each stage of content determination is traditionally carried out by a human. There is the researcher and the writer, who are often, but not always, the same person, and then the editor or editors. Each has an opinion on what content is relevant to the text’s objective. Computational linguists and computer engineers have been looking for ways to reproduce this system using computer programs instead of relying on humans.
Three computational techniques are used for content determination. The ‘schema’ technique is based on the examination of existing written texts: the pre-examined texts serve as a pattern for what information to include in the text being produced. The ‘statistical’ technique determines content automatically, based on statistics about what information similar texts have included. The ‘explicit reasoning’ technique uses artificial intelligence (AI) to examine and filter the information.
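The first two techniques can be illustrated with a short Python sketch. Everything in it is hypothetical: the weather `facts`, the `WEATHER_SCHEMA` list, the `past_texts` corpus and the 0.5 threshold are invented for the example and do not come from any particular system. Explicit reasoning is not sketched here, since it depends on a model of the reader and of the text's goals.

```python
from collections import Counter

# Hypothetical input: the facts available for one weather report.
facts = {
    "temperature": "18 C",
    "wind": "12 km/h",
    "humidity": "60%",
    "pollen": "low",
    "sunrise": "06:12",
}

# Schema technique: a hand-written schema, drawn up after reading existing
# texts of this type, lists which fields belong in the output.
WEATHER_SCHEMA = ["temperature", "wind"]


def select_by_schema(facts, schema):
    """Keep only the facts that the schema names."""
    return {field: facts[field] for field in schema if field in facts}


# Statistical technique: count how often each field was mentioned in past
# texts and keep the fields mentioned in at least `threshold` of them.
past_texts = [
    {"temperature", "wind"},
    {"temperature", "humidity"},
    {"temperature", "wind", "pollen"},
    {"temperature", "wind"},
]


def select_by_statistics(facts, corpus, threshold=0.5):
    """Keep the facts whose field appears often enough in the corpus."""
    if not corpus:
        return {}
    counts = Counter(field for text in corpus for field in text)
    return {
        field: value
        for field, value in facts.items()
        if counts[field] / len(corpus) >= threshold
    }


print(select_by_schema(facts, WEATHER_SCHEMA))
print(select_by_statistics(facts, past_texts))
```

In this toy case both functions select the temperature and wind facts. The difference lies in where the decision comes from: the schema is written by a person who has studied example texts, while the statistical selection is computed from counts over those texts.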
The overall aim of studying content determination is to understand how documents are produced so that the process can be reproduced using computers. The result of such a success would be a computer able to receive data, filter it and produce summaries of the most important information. The computer would base such documents not only on the information, but also on the objectives of the text being produced. This touches on the question raised by the Chinese room argument: whether such a computer could be said to understand the data, or whether it is only replicating and calculating.
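The receive, filter and summarise sequence described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real summariser: the function `summarise`, the example `records` and the idea of expressing the objective as a set of keywords are all invented for illustration.

```python
import re


def summarise(records, objective_terms, max_items=2):
    """Keep the records most relevant to the objective and join their text.

    Relevance is a simple count of objective keywords found in a record.
    """
    def relevance(record):
        words = set(re.findall(r"[a-z0-9%]+", record["text"].lower()))
        return len(words & objective_terms)

    # Rank all records by relevance, most relevant first.
    ranked = sorted(records, key=relevance, reverse=True)
    # Keep the top few, dropping anything with no relevance at all.
    selected = [r for r in ranked[:max_items] if relevance(r) > 0]
    return " ".join(r["text"] for r in selected)


# Hypothetical incoming data for a financial summary.
records = [
    {"text": "Revenue rose 8% in the third quarter."},
    {"text": "The office cafeteria changed its menu."},
    {"text": "Quarter costs fell after the revenue review."},
]

summary = summarise(records, {"revenue", "costs", "quarter"})
print(summary)
```

Here the objective, not the data alone, decides what survives: the cafeteria record is dropped because it shares no terms with the stated objective, and changing the keyword set would change the summary.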