Feb. 4, 2025

So far, Clojure is not the best choice of language when it comes to generating code with LLMs. This is probably because it is a niche language and so is less well represented in the training data. LLMs frequently hallucinate functions that don't exist, among other problems, and they simply seem to have more trouble writing good Clojure code.

However, there may be some advantages to using Clojure when generating code with LLMs. These boil down to Clojure being very concise and its Lisp syntax being easier to verify.

  • Higher information density. Because Clojure is a concise language, you can express more information in fewer tokens. This means the LLM can use less compute for the same outcome. Because it is a Lisp, fewer tokens are spent on syntax, so the LLM can focus on the information that matters for the task.

  • Smaller context. Some LLMs now have very large context windows, but they still seem to perform better with a shorter, more focused context. Once you add too much context, they sometimes lose focus on the main task. Again, because Clojure code is concise, the required context and code examples take up less of that space.

  • A consistent syntax and more functional code mean easier validation. LLMs perform far better when they are used in a loop where the generated code is linted, tested, or otherwise validated, and any errors are piped back into the LLM to correct. Clojure lends itself particularly well to this, as functional code is easier to test in isolation and the syntax is easier to parse, lint, and so on.
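
The validation loop in the last point can be sketched in a few lines of Clojure. This is a minimal illustration, not a full implementation: it only checks that the generated text reads as well-formed forms, using the built-in reader, and a real loop would also lint and run tests. The names `syntax-error` and `generate-valid` are made up for this example, and `generate` stands in for whatever function calls your LLM (assumed here to take a prompt string and return a code string).

```clojure
(defn syntax-error
  "Returns nil when `code` reads as well-formed Clojure forms,
  otherwise a string describing the reader error."
  [code]
  (try
    ;; Disable #= so that reading untrusted text cannot evaluate code.
    (binding [*read-eval* false]
      (let [rdr (java.io.PushbackReader. (java.io.StringReader. code))]
        ;; Read forms one by one until the end-of-input sentinel comes back.
        (loop []
          (when (not= ::eof (read {:eof ::eof} rdr))
            (recur)))))
    nil
    (catch Exception e
      (or (.getMessage e) (str (class e))))))

(defn generate-valid
  "Calls `generate` (hypothetical: prompt string -> code string) up to
  `max-attempts` times, appending the reader error to the prompt on failure."
  [generate prompt max-attempts]
  (loop [attempt 1
         current-prompt prompt]
    (let [code (generate current-prompt)
          err  (syntax-error code)]
      (cond
        (nil? err)
        {:ok? true :code code :attempts attempt}

        (>= attempt max-attempts)
        {:ok? false :code code :error err :attempts attempt}

        :else
        (recur (inc attempt)
               (str prompt "\n\nThe previous attempt failed to parse: " err))))))
```

To try it without an LLM, pass a stub for `generate` that returns canned strings. If the first string is missing a closing parenthesis and the second is complete, the loop should stop on the second attempt. Because the whole syntax check is one call to the reader, the same place is a natural hook for a linter or a test run.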

In my experience, it's possible to get better outcomes when generating Clojure by providing relevant context and code examples. In future we might see code-generating LLMs that are explicitly trained on more Clojure code, which would probably also help.
