
Included options in BibTeXFormatter class to define how the bibtex will be formatted. #13

Open
manoelcampos wants to merge 4 commits into jbibtex:master from manoelcampos:master

Conversation

@manoelcampos

These options enable or disable three behaviors: inserting whitespace around the equals sign that separates a field name from its value; placing the closing brace of each BibTeX entry alone on a new line; and inserting a comma after the last field value of each BibTeX entry. They make it possible to format a BibTeX file that will be read by another application (one that does not use JBibTeX as its parser) and that may consider the file invalid depending on these settings.

You can see the bibtex sample below.
The first entry is formatted using the default options.
The second entry is formatted using the new options.

screenshot
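
As a rough illustration of what the three options control, here is a minimal, self-contained sketch. The class `EntryStyle` and its field names are hypothetical and are not the code in this pull request; the sketch only mimics the described behavior for a single entry.

```java
import java.util.Map;

// Hypothetical illustration only: EntryStyle is not a JBibTeX class.
final class EntryStyle {
    final boolean spacesAroundEquals;    // "year = {1984}" versus "year={1984}"
    final boolean closingBraceOnOwnLine; // closing "}" alone on the last line
    final boolean commaAfterLastField;   // trailing comma after the final field

    EntryStyle(boolean spacesAroundEquals, boolean closingBraceOnOwnLine,
               boolean commaAfterLastField) {
        this.spacesAroundEquals = spacesAroundEquals;
        this.closingBraceOnOwnLine = closingBraceOnOwnLine;
        this.commaAfterLastField = commaAfterLastField;
    }

    // Renders one entry; fields are emitted in the map's iteration order.
    String format(String type, String key, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append('@').append(type).append('{').append(key).append(",\n");
        String eq = spacesAroundEquals ? " = " : "=";
        int i = 0;
        for (Map.Entry<String, String> f : fields.entrySet()) {
            boolean last = (++i == fields.size());
            sb.append("  ").append(f.getKey()).append(eq)
              .append('{').append(f.getValue()).append('}');
            if (!last) {
                sb.append(",\n");
            } else if (commaAfterLastField) {
                sb.append(',');
            }
        }
        sb.append(closingBraceOnOwnLine ? "\n}" : "}");
        return sb.toString();
    }
}
```

Calling `format` with the same fields on `new EntryStyle(true, true, false)` and on `new EntryStyle(false, false, true)` yields two renderings of one entry that differ only in the three aspects listed above.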

@vruusmann
Member

@manoelcampos Thanks for pointing out this issue.

In general, BibTeX formatting has not been a priority. The original idea was that BibTeXParser should collect "style information" about the parsed document and then pass it on to BibTeXFormatter - this way it would be possible to modify the contents of a BibTeX file (say, add a new entry or remove an existing entry) with minimal impact. For example, this is very important when the BibTeX file is under version control.

Also, the BibTeXFormatter could be configured using the builder approach. The Java code would be much more fluent then, and it would be easier to add new configuration options in the future.

@manoelcampos
Author

Well, I think that collecting the "style information" dynamically would require more programming time, but it would be a great feature.
However, in some situations I receive a BibTeX file formatted in one specific style and need to convert it to another (because third-party applications require a specific format).

But if you are interested, I think the builder pattern would be very convenient, and I could refactor these formatter features accordingly.

First, the Builder class would have methods such as buildDefaultFormatter and buildConciseFormatter
(or other, more meaningful names). The latter would build a formatter with the parameters set as in the second style shown in the pull request's picture.

I'm also thinking of doing some refactoring to enable polymorphism and avoid constructions like this:
if (object instanceof BibTeXComment) {
    format((BibTeXComment) object, writer);
} else if (object instanceof BibTeXEntry) {
    format((BibTeXEntry) object, writer);
} ...

I'm thinking of creating a specific formatter class for each type of BibTeX object (comment, entry, include, preamble, string). An interface for these formatter classes would define a format method,
and every formatter class would implement it.

Maybe this interface can be called BibTeXObjectFormatter.
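
To make the idea concrete, here is a minimal sketch of one formatter class per object type behind a shared interface. Only the name `BibTeXObjectFormatter` comes from this comment; the generic parameter, the `Comment` and `Include` stand-ins, the `FormatterRegistry`, and the exact output text are assumptions for illustration. A `StringBuilder` replaces the `Writer` to keep the example free of checked exceptions.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the model classes (hypothetical, not the JBibTeX types).
final class Comment {
    final String text;
    Comment(String text) { this.text = text; }
}

final class Include {
    final String file;
    Include(String file) { this.file = file; }
}

// One formatter per object type; the type parameter is an assumption.
interface BibTeXObjectFormatter<T> {
    void format(T object, StringBuilder out);
}

final class CommentFormatter implements BibTeXObjectFormatter<Comment> {
    @Override
    public void format(Comment c, StringBuilder out) {
        out.append("@Comment{").append(c.text).append('}');
    }
}

final class IncludeFormatter implements BibTeXObjectFormatter<Include> {
    @Override
    public void format(Include i, StringBuilder out) {
        out.append("@Include{").append(i.file).append('}');
    }
}

// Looks up the formatter by runtime class, replacing the instanceof chain.
final class FormatterRegistry {
    private final Map<Class<?>, BibTeXObjectFormatter<?>> formatters = new HashMap<>();

    <T> void register(Class<T> type, BibTeXObjectFormatter<T> formatter) {
        formatters.put(type, formatter);
    }

    @SuppressWarnings("unchecked")
    <T> void format(T object, StringBuilder out) {
        BibTeXObjectFormatter<T> f =
            (BibTeXObjectFormatter<T>) formatters.get(object.getClass());
        if (f == null) {
            throw new IllegalArgumentException("No formatter for " + object.getClass());
        }
        f.format(object, out);
    }
}
```

The registry is only one way to select the right formatter; it trades the compile-time checking of an instanceof chain for a runtime lookup that fails if a type was never registered.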

What do you think?

@vruusmann
Member

As you can see the class BibTeXFormatter is very minimal at the moment. So, it is possible to take it in any direction you want.

The builder pattern would be applicable in other parts of the project as well. For example, the class BibTeXParser could also use some kind of BibTeXObjectFactory for instantiating BibTeXObject classes. This way it would be possible to specify if the application wants the original "style information" preserved or not. I need to think how to attach JavaCC parser Tokens to BibTeXObjects. They are simply discarded at the moment.

If I remember correctly then there were some factory classes used in JBibTeX 1.1.X branch. This branch existed in Google Code repository (https://code.google.com/p/java-bibtex/). It was not transferred over to here.

The BibTeXFormatter builder should also pay attention to character encoding. The standard only permits US-ASCII (i.e., all fancy characters need to be encoded using LaTeX syntax). However, other software may find this too difficult and prefer UTF-8 instead. So, the functions withCharacterEncoding(String) and withEncodeSpecialCharactersAsLaTeXCommands(boolean) will also be necessary.
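
A minimal sketch of how those two builder functions might look. The method names `withCharacterEncoding` and `withEncodeSpecialCharactersAsLaTeXCommands` come from the comment above; the classes `FormatterSketch` and `FormatterSketchBuilder`, the defaults, and the tiny three-character mapping table are assumptions made for illustration, not JBibTeX code.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical formatter holding only the two encoding-related settings.
final class FormatterSketch {
    // Tiny illustrative table; a real mapping would cover far more characters.
    private static final Map<Character, String> LATEX = new HashMap<>();
    static {
        LATEX.put('\u00e9', "{\\'e}");   // e with acute accent
        LATEX.put('\u00fc', "{\\\"u}");  // u with diaeresis
        LATEX.put('\u00f1', "{\\~n}");   // n with tilde
    }

    final Charset charset;
    final boolean encodeSpecialCharactersAsLaTeXCommands;

    FormatterSketch(Charset charset, boolean encodeSpecialCharactersAsLaTeXCommands) {
        this.charset = charset;
        this.encodeSpecialCharactersAsLaTeXCommands = encodeSpecialCharactersAsLaTeXCommands;
    }

    // Replaces known special characters with LaTeX commands when enabled.
    String encode(String value) {
        if (!encodeSpecialCharactersAsLaTeXCommands) {
            return value;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            String command = LATEX.get(c);
            if (command != null) {
                sb.append(command);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}

final class FormatterSketchBuilder {
    // Assumed defaults: plain ASCII output with LaTeX escaping switched on.
    private String characterEncoding = "US-ASCII";
    private boolean encodeSpecial = true;

    FormatterSketchBuilder withCharacterEncoding(String characterEncoding) {
        this.characterEncoding = characterEncoding;
        return this;
    }

    FormatterSketchBuilder withEncodeSpecialCharactersAsLaTeXCommands(boolean encode) {
        this.encodeSpecial = encode;
        return this;
    }

    FormatterSketch build() {
        // Charset.forName throws if the encoding name is illegal or unsupported.
        return new FormatterSketch(Charset.forName(characterEncoding), encodeSpecial);
    }
}
```

A caller targeting software that prefers UTF-8 could then write `new FormatterSketchBuilder().withCharacterEncoding("UTF-8").withEncodeSpecialCharactersAsLaTeXCommands(false).build()`.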

Introduced the Visitor pattern to allow the implementation
of different formatter classes that define how each
BibTeXObject will be formatted.

First, the class BibTeXObject was changed into an interface
that defines a visitable object. Two further interfaces, SingleValueBibTeXObject
and KeyValueBibTeXObject, were introduced; both extend BibTeXObject.

The classes BibTeXComment, BibTeXPreamble and BibTeXInclude now implement
the SingleValueBibTeXObject interface.
The class BibTeXString implements the KeyValueBibTeXObject interface.

The class BibTeXFormatter now implements the new BibTeXObjectVisitor interface,
making it a Visitor.
It is the only formatter implemented so far and, in the future,
it may be renamed to something more meaningful.
Its attributes define the formatting style.
Because it implements the Visitor pattern, there is a specific visit method
for each kind of BibTeXObject
that defines how that object has to be formatted.
This removes the if chains that explicitly checked each
object's type. Instead, the Visitor pattern provides
a polymorphic, dynamic approach.
If a new kind of BibTeXObject is introduced,
we only need to add a new visit method to the
BibTeXObjectVisitor interface, and the compiler will complain
about the missing method in the concrete classes.
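
The structure described above can be sketched roughly as follows. The names `BibTeXObject`, `BibTeXObjectVisitor`, `BibTeXComment`, and `BibTeXPreamble` appear in this commit message, but the bodies here are simplified stand-ins (each object holds a plain string); `FormattingVisitor` and the output text are hypothetical.

```java
// Visitor interface: one visit overload per kind of object.
interface BibTeXObjectVisitor {
    void visit(BibTeXComment comment);
    void visit(BibTeXPreamble preamble);
}

// Every object is visitable through accept.
interface BibTeXObject {
    void accept(BibTeXObjectVisitor visitor);
}

final class BibTeXComment implements BibTeXObject {
    final String text; // simplified: a plain string payload
    BibTeXComment(String text) { this.text = text; }

    @Override
    public void accept(BibTeXObjectVisitor visitor) {
        visitor.visit(this); // resolves to visit(BibTeXComment)
    }
}

final class BibTeXPreamble implements BibTeXObject {
    final String text; // simplified: a plain string payload
    BibTeXPreamble(String text) { this.text = text; }

    @Override
    public void accept(BibTeXObjectVisitor visitor) {
        visitor.visit(this); // resolves to visit(BibTeXPreamble)
    }
}

// Hypothetical formatter: collects the formatted text of each visited object.
final class FormattingVisitor implements BibTeXObjectVisitor {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void visit(BibTeXComment comment) {
        out.append("@Comment{").append(comment.text).append("}\n");
    }

    @Override
    public void visit(BibTeXPreamble preamble) {
        out.append("@Preamble{").append(preamble.text).append("}\n");
    }

    String result() {
        return out.toString();
    }
}
```

Adding, say, a third object type means adding a `visit` overload to `BibTeXObjectVisitor`; `FormattingVisitor` then fails to compile until it implements that overload, which is the compiler check mentioned above.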

A Builder pattern was implemented to build
different Visitor objects (formatters)
in a fluent way.
Currently, the BibTeXFormatterBuilder has
two methods: one for standard BibTeX formatting
and another for a more concise style (which doesn't insert spaces
around the equals sign and doesn't put the closing brace on a new line).
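
As a sketch of the two presets, assuming the method names `buildDefaultFormatter` and `buildConciseFormatter` floated earlier in this conversation (the names in the actual commit may differ). `FormatOptions` and `FormatOptionsBuilder` are hypothetical classes that model only the two settings mentioned here.

```java
// Hypothetical value object holding the two style settings.
final class FormatOptions {
    final boolean spacesAroundEquals;
    final boolean closingBraceOnNewLine;

    FormatOptions(boolean spacesAroundEquals, boolean closingBraceOnNewLine) {
        this.spacesAroundEquals = spacesAroundEquals;
        this.closingBraceOnNewLine = closingBraceOnNewLine;
    }

    // Renders a single "name = {value}" field according to the style.
    String field(String name, String value) {
        return name + (spacesAroundEquals ? " = " : "=") + "{" + value + "}";
    }

    // Renders the end of an entry according to the style.
    String close() {
        return closingBraceOnNewLine ? "\n}" : "}";
    }
}

// Hypothetical fluent builder with two preset factory methods.
final class FormatOptionsBuilder {
    private boolean spacesAroundEquals = true;
    private boolean closingBraceOnNewLine = true;

    FormatOptionsBuilder spacesAroundEquals(boolean value) {
        this.spacesAroundEquals = value;
        return this;
    }

    FormatOptionsBuilder closingBraceOnNewLine(boolean value) {
        this.closingBraceOnNewLine = value;
        return this;
    }

    FormatOptions build() {
        return new FormatOptions(spacesAroundEquals, closingBraceOnNewLine);
    }

    // Standard style: spaces around "=" and the closing brace on its own line.
    static FormatOptions buildDefaultFormatter() {
        return new FormatOptionsBuilder().build();
    }

    // Concise style: no spaces around "=" and no newline before the closing brace.
    static FormatOptions buildConciseFormatter() {
        return new FormatOptionsBuilder()
            .spacesAroundEquals(false)
            .closingBraceOnNewLine(false)
            .build();
    }
}
```

Keeping the individual setters alongside the presets lets callers start from either style and adjust a single setting without a new preset method.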

The class BibTeXFormatter was refactored by extracting many methods in order to:
make each method smaller and give each one a single responsibility;
remove duplicated code; and give the methods more meaningful names.

This class uses the new BibTeXStringBuilder class, which has
some utility methods to construct a Java StringBuilder object
containing the representation of a BibTeXObject.
This class does not construct BibTeXString objects,
but rather a string representation of any BibTeXObject.
I confess that I couldn't find a better name to avoid confusion :(.
@manoelcampos
Author

I performed several refactorings.
The commit https://github.com/manoelcampos/jbibtex/commit/c28aedb1b6ccaccd62d4b80562f8a00b73a570d4
has the complete details.
