Why I am not using Zend_Markup
Sunday, February 28, 2010
About three months ago I took notice of the new Zend_Markup component and I was very pleased to see such a component in Zend Framework. To my own fault I forgot about it follow the development process in the incubator and only remembered it again two weeks before the Zend Framework 1.10 release. By that time it had many flaws and inconsistency in the code, to name just a few of them:- Converter codes where called tags, which only was correct for the BBCode to HTML conversion
- It didn't follow the PHP coding standard completly
- The documentation was more than incomplete
- There were majors bugs and inconsistency against other BBCode parsers
When Matthew was ready to create the 1.10.0 tag, I mentioned to him that Zend_Markup would still require some more time before it could make it into the Zend Framework core, tho it was already merged into trunk by him. I was really wondering at that point, why Matthew didn't look deeper into the code while reviewing it, but on the other hand he cannot always understand the complete complexity of a component, so I don't really blame him for that. Tho he should have spotted that the documentation was incomplete and that the coding standard was violated at some points.
As he was going to create the 1.10.0 tag and I told him about some flaws which were still left, he moved the release two days further, so Pieter Kokx had still some time left to fix the missing parts. At that point I already forgot about the documentation and Zend_Markup made it into the 1.10 release.
A few days later, I wanted to implement Zend_Markup in my blog, replacing the old StringParser_BBCode. After I converted all callbacks and implemented all BBCodes I started unit-testing Zend_Markup against my old implementation. At that point I started spotting other major bugs which should not be unique to my own code:
- BBCodes withing a code-tag would have been rendered after the closing code-tag
- Newline characters would be rendered around block-level elements
- List-items cannot contain newlines within, as they are treated as stoppers for those
- Converter codes were partly still called tags
When I noticed the problem with newline characters, I told Pieter that I'd create an issue ticket together with a patch, which I planned to work like in my own implementation to handle newline-characters around and within tags. To illustrate the problem, think about the common usual BBCode:
This code would (somewhat) result in the following HTML with Zend_Markup:This is a nice list:[list][*]First item[*]Second item[/list]That's the list
As you can see, there will be unwanted newlines rendered around the <ul> block-level element. After some tinkering around I had to find out that it was impossible to implement without major refactoring, which would affect the backward compatibility and thus had no chance to be fixed before Zend Framework 2.0.This is a nice list:<br /><br /><ul><li>First item</li><li>Second item</li></ul><br /><br />That's the list
In the end there is a component left which has neither a complete documentation (many methods are not documented in the manual) nor it can be used to full extend as a usual BBCode-Parser. I ended up refactoring the BBCode-parser I partly wrote withing my company and using it as final replacement until I can see that Zend_Markup does not suffer by those teething problems.
Again I'm really disapointed that this component was rushed into the 1.10 release, which again was partly my fault by not noticing given problems earlier. The probably best thing would have been to postpone it to 1.11 or 2.0, so that the all problems requiring a BC-break could be fixed and the documentation be fixed, but now it's too late for that.
Comments to this article
Leave a comment
Please note that your email address will not be shown, it is only used to fetch your avatar image from gravatar.com and for notifications.


When I'm reviewing code, I'm typically looking at coding standards, tests, and documentation. I'm curious what CS issues you saw in trunk, as I actually went through and revised/refactored code to conform to CS prior to pushing into trunk. As for tests, these were doing an adequate job of testing the behaviors outlined in the proposal, as well as some additional ones. Documentation could be better, but showed at least minimal use cases; advanced use cases are often fleshed out during the RC phase or following the release once we get more people using the component.
I will be the first to admit that I do not always evaluate the algorithms or approach used in the code to accomplish their tasks. If I am unfamiliar with the problem domain, it makes sense to leave that task up to community review -- which, as you note, may have happened late in the release process. We have had very few occasions, if any, to remove code from trunk after it has already been promoted; it feels unfair to the contributor to do so, since they have already put in much work, and typically the code has been sitting in the incubator for weeks if not months, where it could be reviewed prior to promotion. Incubation is the appropriate time to make architectural critiques.
When you raised your concerns, we were at relatively late stages of the release process. The code as written was basically doing what it should be doing, and while I understood you had architectural concerns, I did not feel they should stand in the way of release, particularly when the contributor was not only willing to work to fix things prior to release, but was actively doing so.
Considering how soon 2.0 will be upon us, it also feels reasonable to get functionality out that developers can use *now*, and refactor as we approach 2.0. I hope that you will be willing to assist, as you certainly have some strong ideas on how to do so
In fact I just looked through fisheye and it seems that I confused it a bit. Most coding standard violations were actually fixed prior to the promotion, and I'm not perfectly sure about the remaining ones. At that point it depends on how strict you consider the coding standard, but as far as I remember, there should be no kind of inline-comment outside of methods, like for example at the end of:
http://framework.zend.com/code/browse/~raw,r=19720/Zend_Framework/standard/trunk/library/Zend/Markup/Renderer/RendererAbstract.php
But as I said, it depends on how strict you see such stuff. PHPCS would probably see it as issue, some humans may not. As for the tests, I didn't mention them as we already spoke about that and I accpeted your valid points there.
I'm sure we can do the appropriate refactoring for 2.0, and as I mentioned, I'm interested in using Zend_Markup when it fits my expectations.
There are no restrictions against those; typically, we'll use a block comment for that sort of thing, but the CS doesn't address it one way or the other. Such comments are useful for giving a reviewer a visual clue as to how methods are grouped within the class.
Glad to hear you're on board for helping improve the component!
BTW, 2 things about your blog: 1) it'd be nice to get emails on follow-up comments. 2) the first time I enter a captcha always fails; it only takes on the second try.
The email-notify is planned for the next version (9.2), as for the captcha, that has to be a bug in Zend_Captcha then, maybe the lifetime is not long enough, you can report that in Jira ;)
Awesome -- got a comment notification just now. :)
As for Zend_Captcha -- odd. Most likely it's a expiration-hops issue. I'll see if I can debug it sometime soon.