The missing “Value” of “AI”-generated Content

The image shows colourful dots randomly spread on a black background. Source: https://www.publicdomainpictures.net/en/view-image.php?image=234852&picture=ruido-de-color

This foxhole has been quite silent. There are many reasons for this. The main one is that I write a lot of other material for other blogs, technical documentation, or code. I recently came across a text a colleague sent me. He asked for feedback on a manual-like document. The text had been reworked multiple times, and this was yet another iteration with additional text and some rearrangement. ChatGPT had created parts of it. After spending more than an hour with the document, my feedback criticised vague terms, missing definitions, too many general statements, and a complete lack of the structure a manual needs. It looked like a collection of thoughts and references mixed with some instructions. I could not pinpoint which parts ChatGPT had created, but the document had no proper flow when read. This is the reason for this article.

Using Large Language Models (LLMs) for text generation has spread to widely available tools and platforms. These models have gone through several iterations, but they have only grown bigger. The underlying technology is still the same. Start-ups (i.e. companies with no solid business model) hope that making these models more complex will yield better results. The generated texts look good and are easy to read, but they completely lack any thoughtful structure. If you write a manual that is meant to instruct people to do something sensibly, then the text requires some didactics. You need to divide the topics into sensible groups. Instructions must be easy to understand and concise. Technical documents require clear definitions of terms and concepts. All topics should be in the right order, which usually means in ascending order of complexity. LLMs cannot do this, because they have no cognitive skills. This is not meant as arrogance; machines cannot think, at least for now.

People use LLMs for summarising content. If you use common search engines, you will find lots of tutorials on how to use “AI” tools to create summaries. The problem is that LLM engines cannot summarise content. They just remove most parts and reorder the rest. The result may look like a summary, but it is not. There is a reason why writing summaries is a teaching tool. You can check whether someone understood the key points of a document, book, or novel. A good summary requires a lot of thought, because you need to keep the key ideas and remove everything else. I have read some LLM-generated summaries. While reading, I noticed a lack of structure and repetitions. Someone once gave me a three-paragraph text to help me write documentation. I had to rewrite the piece, because it would not fit. This is another noticeable difference. If you ask a person to produce a text fragment that can be inserted into a different document, then this person needs the skill to write a fragment. A fragment is not a randomly shortened text. You need to know how its beginning and end can connect to other texts, which you do not know yet.

And then there is the question of energy consumption. LLMs are destroying Iceland as you read this sentence. Think twice before spending lots of energy, and possibly money, on mediocre results.
