Included options in BibTeXFormatter class to define how the bibtex will be formatted. #13
manoelcampos wants to merge 4 commits into jbibtex:master from manoelcampos:master
…ll be formatted. The options allow enabling or disabling: the insertion of whitespace around the equals sign that separates a field name from its value; placing the closing bracket of each bibtex entry alone on a new line; and inserting a comma after the last field value of each bibtex entry. These options were included so that a bibtex file can be formatted for another application (one that does not use JBibTeX as its parser) that may consider the file invalid depending on these settings.
|
@manoelcampos Thanks for pointing out this issue. In general, BibTeX formatting has not been a priority. The original idea was that BibTeXParser should collect "style information" about the parsed document and then pass it on to BibTeXFormatter - this way it would be possible to modify the contents of a BibTeX file (say, add a new entry or remove an existing entry) with minimal impact. For example, this is very important when the BibTeX file is under version control. Also, the BibTeXFormatter could be configured using the builder approach. The Java code would be much more fluent then, and it would be easier to add new configuration options in the future. |
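To make the builder suggestion concrete, here is a minimal sketch of how the three formatting options from this pull request could be set fluently. Everything in it is an assumption for illustration: `FormatterSketch`, its `with...` methods, the default values and the tab indentation are hypothetical and are not JBibTeX classes or the code in this PR.

```java
import java.util.Map;

// Hypothetical sketch only: not the JBibTeX BibTeXFormatter API.
final class FormatterSketch {
    private final boolean spacesAroundEquals;
    private final boolean closingBracketOnNewLine;
    private final boolean trailingComma;

    private FormatterSketch(Builder b) {
        this.spacesAroundEquals = b.spacesAroundEquals;
        this.closingBracketOnNewLine = b.closingBracketOnNewLine;
        this.trailingComma = b.trailingComma;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Assumed defaults; the PR does not state the real ones.
        private boolean spacesAroundEquals = true;
        private boolean closingBracketOnNewLine = true;
        private boolean trailingComma = false;

        Builder withSpacesAroundEquals(boolean v) { spacesAroundEquals = v; return this; }
        Builder withClosingBracketOnNewLine(boolean v) { closingBracketOnNewLine = v; return this; }
        Builder withTrailingComma(boolean v) { trailingComma = v; return this; }

        FormatterSketch build() { return new FormatterSketch(this); }
    }

    // Formats one entry; fields are written in the map's iteration order.
    String format(String type, String key, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append('@').append(type).append('{').append(key).append(',');
        String eq = spacesAroundEquals ? " = " : "=";
        int written = 0;
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append("\n\t").append(f.getKey()).append(eq)
              .append('{').append(f.getValue()).append('}');
            written++;
            // Comma between fields, and after the last one only if requested.
            if (written < fields.size() || trailingComma) {
                sb.append(',');
            }
        }
        sb.append(closingBracketOnNewLine ? "\n}" : "}");
        return sb.toString();
    }
}
```

A caller would write something like `FormatterSketch.builder().withTrailingComma(true).build()`; adding a new option later means adding one field and one `with...` method, without touching existing call sites.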
|
Well, I think that collecting the "style information" dynamically would require more programming time, but it would be a great feature. If you are interested, I think the builder pattern would be very convenient, and I could refactor these formatter features accordingly. First, the Builder class would have methods such as buildDefaultFormatter and buildConciseFormatter. I'm also thinking of some refactorings to enable polymorphism and avoid constructions like this: What I'm thinking is to create a specific formatter class for each type of bibtex object (comment, entry, include, preamble, XSString). An interface for the formatter classes could then define a method format. Maybe this interface could be called BibTeXObjectFormatter. What do you think? |
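As a rough illustration of the per-type formatter idea, the sketch below gives each object type its own formatter behind a common interface. Only the name `BibTeXObjectFormatter` and its `format` method come from the comment above; the signature, the stand-in object types (`BibObject`, `CommentObject`, `PreambleObject`) and the `FormatterRegistry` lookup are assumptions added so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for parsed objects; not the JBibTeX classes.
interface BibObject {}

final class CommentObject implements BibObject {
    final String text;
    CommentObject(String text) { this.text = text; }
}

final class PreambleObject implements BibObject {
    final String value;
    PreambleObject(String value) { this.value = value; }
}

// One implementation per object type; the signature is a guess.
interface BibTeXObjectFormatter<T extends BibObject> {
    String format(T object);
}

// Assumed helper: picks the formatter by the runtime class of the object,
// so callers need no if/instanceof chain.
final class FormatterRegistry {
    private final Map<Class<?>, BibTeXObjectFormatter<?>> formatters = new HashMap<>();

    <T extends BibObject> void register(Class<T> type, BibTeXObjectFormatter<T> formatter) {
        formatters.put(type, formatter);
    }

    @SuppressWarnings("unchecked")
    <T extends BibObject> String format(T object) {
        BibTeXObjectFormatter<T> f =
                (BibTeXObjectFormatter<T>) formatters.get(object.getClass());
        if (f == null) {
            throw new IllegalArgumentException("No formatter for " + object.getClass());
        }
        return f.format(object);
    }
}
```

With this shape, a formatter can be registered as a lambda, for example `registry.register(CommentObject.class, c -> "@Comment{" + c.text + "}")`. The trade-off is that a missing formatter is only detected at run time.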
|
As you can see, the class BibTeXFormatter is very minimal at the moment, so it is possible to take it in any direction you want. The builder pattern would be applicable in other parts of the project as well. For example, the class BibTeXParser could also use some kind of BibTeXObjectFactory for instantiating BibTeXObject classes. This way it would be possible to specify whether the application wants the original "style information" preserved or not. I need to think about how to attach JavaCC parser Tokens to BibTeXObjects. They are simply discarded at the moment. If I remember correctly, there were some factory classes used in the JBibTeX 1.1.X branch. This branch existed in the Google Code repository (https://code.google.com/p/java-bibtex/). It was not transferred over to here. The BibTeXFormatter builder should also pay attention to character encoding. The standard only permits US-ASCII (i.e. all fancy characters need to be encoded using LaTeX syntax). However, other software may find that too difficult and prefer UTF-8 instead. So, functions withCharacterEncoding(String) and withEncodeSpecialCharactersAsLaTeXCommands(boolean) will also be necessary. |
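The two encoding-related builder methods named above could look roughly like the following. This is only a sketch under stated assumptions: the class `OutputEncoding`, its defaults (US-ASCII with LaTeX escaping enabled) and the three-character escape table are hypothetical; only the two `with...` method names are taken from the comment.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not part of JBibTeX.
final class OutputEncoding {
    // Tiny illustrative table; a real one would cover far more characters.
    private static final Map<Character, String> LATEX = new HashMap<>();
    static {
        LATEX.put('\u00e9', "{\\'e}");   // e with acute accent
        LATEX.put('\u00fc', "{\\\"u}");  // u with diaeresis
        LATEX.put('\u00f1', "{\\~n}");   // n with tilde
    }

    private final Charset charset;
    private final boolean latexCommands;

    private OutputEncoding(Charset charset, boolean latexCommands) {
        this.charset = charset;
        this.latexCommands = latexCommands;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Assumed defaults: US-ASCII output with LaTeX escaping enabled.
        private Charset charset = StandardCharsets.US_ASCII;
        private boolean latexCommands = true;

        Builder withCharacterEncoding(String name) {
            this.charset = Charset.forName(name);
            return this;
        }

        Builder withEncodeSpecialCharactersAsLaTeXCommands(boolean enabled) {
            this.latexCommands = enabled;
            return this;
        }

        OutputEncoding build() {
            return new OutputEncoding(charset, latexCommands);
        }
    }

    // Replaces known special characters with LaTeX commands when enabled.
    String escape(String value) {
        if (!latexCommands) {
            return value;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            String command = LATEX.get(c);
            sb.append(command != null ? command : String.valueOf(c));
        }
        return sb.toString();
    }

    // Escapes first, then encodes with the configured charset.
    byte[] toBytes(String value) {
        return escape(value).getBytes(charset);
    }
}
```

Keeping the two settings independent lets a caller choose UTF-8 output while still escaping special characters, or turn escaping off entirely for software that expects raw UTF-8.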
Inclusion of the Visitor pattern to allow the implementation of different formatter classes that define how each BibTeXObject will be formatted.

First, the class BibTeXObject was changed to an interface that defines a Visitable object. Two other interfaces, SingleValueBibTeXObject and KeyValueBibTeXObject, were introduced; both extend BibTeXObject. The classes BibTeXComment, BibTeXPreamble and BibTeXInclude now implement the SingleValueBibTeXObject interface. The class BibTeXString implements the KeyValueBibTeXObject interface.

The class BibTeXFormatter now implements the new BibTeXObjectVisitor interface, making it a Visitor object. This class is the only formatter implemented so far and, in the future, it may be renamed to something more meaningful. It holds the attributes that define the formatting style. Because it implements the Visitor pattern, there is a specific visit method for each kind of BibTeXObject that defines how that object is formatted. In this way, the if chains that explicitly checked each object type were removed and replaced by the polymorphic, dynamic dispatch of the Visitor pattern. If a new kind of BibTeXObject is introduced, we only need to add a new visit method to the BibTeXObjectVisitor interface, and the compiler will complain about the missing method in the concrete classes.

A Builder pattern was implemented to build different Visitor objects (formatters) in a fluent way. Currently, BibTeXFormatterBuilder has two methods: one for the standard bibtex formatting and another for a more concise one (which doesn't insert spaces around the equals sign and doesn't put the closing bracket on a new line).

The class BibTeXFormatter was refactored, extracting many methods in order to: make each method smaller and give each one a single responsibility; remove duplicated code; and give the methods more meaningful names. This class uses the new BibTeXStringBuilder class, which has some utility methods for constructing a Java StringBuilder object containing the representation of a BibTeXObject. This class is not for constructing BibTeXString objects, but a string representation of any BibTeXObject. I confess that I couldn't find a better name to avoid confusion :(.
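The double dispatch described in this commit message can be sketched as follows. The types here (`Visitable`, `ObjectVisitor`, `CommentNode`, `StringNode`, `ConciseFormatter`) are simplified, hypothetical stand-ins; the actual signatures of BibTeXObject and BibTeXObjectVisitor in the commit are not shown in this conversation, so nothing below should be read as the real API.

```java
// Hypothetical, simplified stand-ins for the interfaces named in the commit.
interface ObjectVisitor {
    void visit(CommentNode comment);
    void visit(StringNode string);
}

interface Visitable {
    void accept(ObjectVisitor visitor);
}

final class CommentNode implements Visitable {
    final String text;
    CommentNode(String text) { this.text = text; }

    @Override
    public void accept(ObjectVisitor visitor) {
        visitor.visit(this); // overload chosen by the static type of 'this'
    }
}

final class StringNode implements Visitable {
    final String key;
    final String value;
    StringNode(String key, String value) { this.key = key; this.value = value; }

    @Override
    public void accept(ObjectVisitor visitor) {
        visitor.visit(this);
    }
}

// A formatter is one visitor; another style would be another implementation.
final class ConciseFormatter implements ObjectVisitor {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void visit(CommentNode comment) {
        out.append("@Comment{").append(comment.text).append("}\n");
    }

    @Override
    public void visit(StringNode string) {
        out.append("@String{").append(string.key).append("={")
           .append(string.value).append("}}\n");
    }

    String result() {
        return out.toString();
    }
}
```

Formatting a document then reduces to calling `accept(formatter)` on each object in order. Adding a new node type forces a new `visit` overload on `ObjectVisitor`, which every existing formatter must implement before the code compiles again.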
|
I performed several refactorings. |
The options allow enabling or disabling: the insertion of whitespace around the equals sign that separates a field name from its value; placing the closing bracket of each bibtex entry alone on a new line; and inserting a comma after the last field value of each bibtex entry. These options were included so that a bibtex file can be formatted for another application (one that does not use JBibTeX as its parser) that may consider the file invalid depending on these settings.
You can see the bibtex sample below.
The first entry is formatted using the default options.
The second entry is formatted using the new options.