Emulate BiBTeX's behavior #1231

Closed
tkw1536 wants to merge 25 commits into brucemiller:master from tkw1536:bibtexml

tkw1536 commented Jan 27, 2020

I've been working on this PR for some time. It changes the way BiBTeX is handled in LaTeXML. In short, it adds a '.bst' file emulator to format bibliography entries exactly the way BiBTeX does.

The new approach has the following advantages:

  • it emulates the behavior of TeX more closely
  • it supports any kind of BiBTeX style, not just the pre-existing ones
  • it lets us keep track of source references in <ltx:bibblock>s

It comes with some caveats too:

  • emulation is slower than the original approach, because we have to emulate a BiBTeX stack
  • splitting the bibliography is no longer easily possible, because ordering is now determined by the '.bst' file
  • it is unclear what should happen with support for bibliographies in XML format; the PR in its current state does not really support them
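
To illustrate the first caveat: a '.bst' file is a postfix program, so an emulator has to interpret it step by step on a stack. The following is a minimal sketch in Python, not the code from this PR. The `(kind, value)` token format and the `run` helper are assumptions made for illustration, and only a handful of built-ins are modeled.

```python
# Minimal sketch of a postfix stack machine in the spirit of a '.bst' interpreter.
# The (kind, value) token representation is hypothetical, not taken from the PR.

def concat(stack):
    # Models `*`: pop two strings, push their concatenation.
    second = stack.pop()
    first = stack.pop()
    stack.append(first + second)

def add(stack):
    # Models `+`: pop two integers, push their sum.
    second = stack.pop()
    first = stack.pop()
    stack.append(first + second)

def duplicate(stack):
    # Models `duplicate$`: push a copy of the top element.
    stack.append(stack[-1])

def swap(stack):
    # Models `swap$`: exchange the two topmost elements.
    stack[-1], stack[-2] = stack[-2], stack[-1]

BUILTINS = {"*": concat, "+": add, "duplicate$": duplicate, "swap$": swap}

def run(tokens):
    """Evaluate tokens left to right: literals are pushed, functions are applied."""
    stack = []
    for kind, value in tokens:
        if kind == "fn":
            BUILTINS[value](stack)
        else:
            stack.append(value)
    return stack
```

Every literal and every function call goes through the interpreter loop, which is where the extra cost relative to hand-written formatting code would come from.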

This PR is split into three commits:

  • Adding the BiBTeX emulator (but not changing LaTeXML itself)
  • Updating 'MakeBibliography' to make use of this emulator
  • Adding a new test for the new approach

I discussed this offline with @brucemiller: he doesn't expect to merge this PR as is, but it should serve as a starting point, since it is a mostly feature-complete, working version of the new approach.

tkw1536 force-pushed the bibtexml branch 5 times, most recently from 0d1181b to 403a136, on January 31, 2020 21:39

tkw1536 commented Feb 10, 2020

There are still a few things to take care of before this can be merged.
Here is the list I can think of:

  • rethink support for LaTeXML's XML serialization of '.bib' files
  • remove BiBTeX.pool?
  • think about where bibtex "integration" tests should go and when they should be run
  • fix the non-working test cases, or remove them if their mode is no longer supported
  • rethink what should happen if a .bst does not exist. Currently it's an error if the file doesn't exist and a fatal if it can't be parsed
  • optimisation: only instantiate one latexml instance for running the .bbl, then use daemon frames when re-running. Perhaps somehow re-use STATE from the parent document (if available)
  • rethink which attribute to split by
  • rework "ltx:tag"s: should we perhaps format them according to the old MakeBibliography?
  • adapt code style

/cc @brucemiller This is all I can think of. Anything I forgot?

tkw1536 and others added 24 commits June 15, 2021 14:38
  • With this commit we add the former BiBTeXML project as a new subsystem to LaTeXML.
  • This commit replaces the old MakeBibliography with a new version based on the BiBTeX emulator.
  • This commit adds a simple daemon test for the new BiBTeX interface.
  • This commit adds a new target 'make bibtest' to Makefile.PL that runs the BiBTeX tests.
  • This commit inlines convenience imports to LaTeXML::Post::BiBTeX::Compiler.
  • This commit inlines imports to LaTeXML::Post::BiBTeX::Bibliography.
  • This commit inlines imports to LaTeXML::Post::BiBTeX::BibStyle.
  • Previously, the empty string was treated as containing exactly one name, the empty name. This caused problems in specific .bst files that relied on a name count eventually becoming 0. This commit updates the behavior, and makes num.names$ and splitNames return 0 and the empty array, respectively, for the empty input string.
  • Previously, when an invalid index was passed to 'text.substring', the function would return undef because of a Perl implementation detail. This commit updates the behavior to instead return the empty string when an invalid index is passed.
  • Previously, when defining variables inside a .bst file, these were not initialized. This caused certain stylesheets, which expected them to have a sensible default value, to fail. This commit silently initializes newly defined variables.
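
The substring fix described in the commit messages above can be pictured with a small Python sketch. This is not the PR's Perl code. The function name `substring` and its exact edge-case rules are assumptions, loosely modeled on how BibTeX's `substring$` is usually described (1-based start, negative start counting from the end); the real behavior may differ in corner cases.

```python
def substring(text, start, length):
    """Return up to `length` characters of `text`, 1-based like substring$.

    A positive `start` counts from the front; a negative `start` counts
    from the back and marks the last character of the result.
    Any invalid index yields the empty string rather than None.
    """
    if length <= 0 or start == 0 or abs(start) > len(text):
        return ""
    if start > 0:
        begin = start - 1
        return text[begin:begin + length]
    end = len(text) + start + 1  # exclusive end index for a negative start
    return text[max(0, end - length):end]
```

For instance, `substring("abcdef", 2, 3)` yields `"bcd"`, while an out-of-range call such as `substring("abc", 10, 2)` yields `""`, so callers can keep concatenating without checking for a missing value.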

tkw1536 commented Jun 15, 2021

While working on biblatex, I've found a couple of minor bugs on the branch.
I've got a master-rebased version that can run biblatex.bst without running into infinite loops.

brucemiller (Owner) commented

If you've gone to the trouble of rebasing, you might as well force push it. It'll make the rest of the work easier, even if I end up stealing your code rather than merging.

But I'm curious that interpreting biblatex.bst depends on code from this branch? Seems likely that it would depend on bib-related infrastructure in LaTeX.pool, but not the actual BibTeX interpreter. If it's BibTeX/BibLaTeX infrastructure, that probably deserves a separate PR, doncha think?


tkw1536 commented Jun 15, 2021

One can do \usepackage[backend=bibtex]{biblatex}, which generates the bbl using bibtex as opposed to biber. That depends on a couple of (undocumented as always) bibtex implementation details. I haven't integrated this running with biblatex as a whole (yet).

Regarding implementation details:

  • I figured out that bibtex counts 0 names in the empty string, whereas the naive perl implementation counted it as having 1 name.
  • The initialization behavior of bst-defined, but unassigned, variables is tricky. From the documentation I had read so far, I assumed that they would simply be in an undefined (and unusable) state until explicitly assigned. This is in fact not the case: they are assigned default values. For integers, that's 0. For strings, that's the empty string.
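
Both details can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the emulator's actual code: `split_names`, `num_names`, and `declare` are hypothetical names, and the splitting rule (the word "and" surrounded by whitespace at brace depth 0) is a simplification of what bibtex does.

```python
def split_names(field):
    """Split a name list on ' and ' at brace depth 0; empty input gives []."""
    if not field.strip():
        return []
    names = []
    depth = 0
    start = 0
    lower = field.lower()
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(0, depth - 1)
        elif (depth == 0 and ch.isspace()
              and lower.startswith("and", i + 1)
              and i + 4 < len(field) and field[i + 4].isspace()):
            names.append(field[start:i].strip())
            start = i + 5
            i += 4  # land on the whitespace that follows "and"
            continue
        i += 1
    names.append(field[start:].strip())
    return names

def num_names(field):
    # The empty string contains 0 names, not 1 empty name.
    return len(split_names(field))

# Defaults given to declared but unassigned variables.
DEFAULTS = {"integer": 0, "string": ""}

def declare(variables, name, kind):
    """Register a variable and initialize it to its type's default."""
    variables[name] = DEFAULTS[kind]
```

A naive `field.split(" and ")` returns a one-element list for the empty string, which is the off-by-one described in the first bullet.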

Regarding running biblatex with bibtex backend:

  • It uses a biblatex.bst style file. Parameters are passed to it via a specialized, biblatex-generated @Control entry in a .bib file. This entry has to appear first among the cited entries for anything to work.
  • It produces a .bbl that is semantically similar to using the biber backend. Compare bibtex and biber.
  • I could imagine that instead of trying to re-implement processing using biber, we (for now) only support the bibtex backend, falling back to it even if the biber backend is explicitly requested.
  • I still have to figure out the tex context the generated .bbl is run in. I tried last week, but decided I was somewhat stuck and needed a different front to make progress on.
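
The ordering constraint on the control entry could be enforced along these lines. This is only a sketch: `order_cited_keys` is a hypothetical helper, and the key `"ctrl"` in the usage note is a placeholder, not the key biblatex actually generates.

```python
def order_cited_keys(cited, control_key):
    """Return the cited keys with `control_key` first and duplicates dropped.

    The control key is always placed at the front, even if it was not
    in `cited`; the relative order of the other keys is preserved.
    """
    seen = set()
    ordered = []
    for key in [control_key] + list(cited):
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered
```

For example, `order_cited_keys(["knuth84", "ctrl", "lamport94"], "ctrl")` gives `["ctrl", "knuth84", "lamport94"]`.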


tkw1536 commented Mar 31, 2025

I'm still hoping a variant of this will get merged at some point, but as nothing has happened in a little over 2 years I'm going to close it.
If there is interest, feel free to re-open, or consider #1955.

tkw1536 closed this Mar 31, 2025