There are two problems with the code. One is the strange things it does to some characters (stripping diacritics, converting some graphemes to two separate letters by assuming they are ligatures, and so on). Fixing this would not change the code.
The other is that it doesn't escape '<' and '>' correctly so embedded HTML-like text gets improperly interpreted as HTML. One of the advantages of some of the templating systems is the default mode is to escape everything, making it harder to do XSS and other attacks. Fixing that might make the code longer, or not, depending on the solution.
(Just checking if I can make this <b>bold</b>. If so .. hmm.)