## Tuesday, April 6, 2010

### BBL to BIB with BibDesk

BibDesk manages my bibliography just like iTunes manages my music: I have a single iTunes library, and I have a single master bibliography. When writing a paper, however, I want a new bib file containing only the publications cited in that paper. BibTeX nicely creates a bibliographic reference file (*.bbl), but I want a bibliography file (*.bib) so that I can open the paper's publications in BibDesk. So what I needed was a tool that converts the bbl file into a bib file. Fortunately, I found everything for that on my machine: AppleScript, grep, sed and, of course, BibDesk. I wrote a small script that creates a new bib file from a source bib file and a selected bbl file.

To use it, install the script listed below in your script folder (~/Library/Scripts) and name it "Create BIB from BBL.scpt". Open your master bib file with BibDesk, select a bbl file in the Finder, and run the script from the script menu in the menu bar. It creates a new bib file in the folder of the bbl file, with the same name as the bbl file but with the extension bib.

-- Creates a new bib-file from a bbl-file using BibDesk
-- (C) 2010 Jens von Pilgrim
-- Version: 1.1, 20100615

tell application "Finder"
set selectedItems to the selection
end tell

if ((count of selectedItems) = 0) then
display dialog "Please select at least one bbl file" buttons {"OK"}
return
end if

-- retrieve master bib file
set sourceDoc to ""
tell application "BibDesk"
if (count of documents) = 0 then
display dialog "Please open the source bibliography with BibDesk" buttons {"OK"}
return
end if
if (count of documents) > 1 then
set listOfNames to {} as list
repeat with doc in documents
set listOfNames to listOfNames & name of doc
end repeat
set selected to (choose from list listOfNames) as string
repeat with i from 1 to the count of documents
set doc to (item i of documents)
set strDocName to name of doc as string
if (strDocName is equal to selected) then
set sourceDoc to document i
log "source doc set"
end if
end repeat
if sourceDoc is equal to "" then
return
end if
else
set sourceDoc to document 1
end if
end tell

-- log "copy items from master bib file"

-- convert all selected bbl files
repeat with theItem in selectedItems

set theFile to theItem as alias
set posixpath to POSIX path of theFile

if (offset of ".bbl" in posixpath) > 0 then
set destPosixpath to (my rename(posixpath, "bbl", "bib"))
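set destFile to POSIX file destPosixpath
-- skip stays false unless the user declines to overwrite an existing file
set skip to false
tell application "Finder"
if (exists destFile) then
set rep to display dialog "File " & destPosixpath & " already exists. Overwrite?" buttons {"Yes", "No"}
set skip to ((button returned of rep) is "No")
end if
end tell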

if (not skip) then

-- log "examine bbl file " & quoted form of posixpath

-- actually, this is the most important line:

set allCites to do shell script "grep -E \"\\\\\\\\bibitem[[:space:]]*(\\[[^]]*\\])?[[:space:]]*{([^}]*)}\" " & (quoted form of posixpath) & " | sed -E \"s/\\\\\\\\bibitem[[:space:]]*(\\[[^]]*\\])?[[:space:]]*{([^}]*)}/\\2/\""

set numberOfItems to length of paragraphs of allCites
if numberOfItems = 0 then
display dialog "No bibitems found in " & posixpath & ", maybe the file does not contain any bibitems or the search pattern does not recognize your file format." buttons {"Too bad."}
else
set numberOfMissedItems to 0
set missedItems to ""
tell application "BibDesk"
set destDoc to make new document
repeat with cite in paragraphs of allCites
set bibs to search sourceDoc for cite
set bFound to false
repeat with bib in bibs
set strFoundKey to cite key of bib as string
set strCite to cite as string
if strFoundKey is equal to strCite then
--  log strCite & " found, add to new bib"
set newBib to make new publication at end of publications of destDoc
duplicate bib to newBib
set bFound to true
end if
end repeat
if not bFound then
set numberOfMissedItems to numberOfMissedItems + 1
if (numberOfMissedItems > 1) then
set missedItems to missedItems & ", "
end if
set missedItems to missedItems & cite
end if
end repeat
save destDoc in destFile
end tell
if numberOfMissedItems > 0 then
display dialog "Did not find " & numberOfMissedItems & " out of " & numberOfItems & " items: " & missedItems & "." buttons {"Uups"}
end if
end if
end if
else
display dialog "Can only extract bib entries from BBL file, was " & posixpath
end if
end repeat

-- this sub-routine just comes up with the new name
on rename(item_name, item_ext, new_extension)
set the trimmed_name to text 1 thru -((length of item_ext) + 2) of the item_name
set target_name to (the trimmed_name & "." & new_extension) as string
return the target_name
end rename

If you don't have BibDesk, you might want to have a look at Michael Zhang's Perl script. Instead of parsing the bbl file, Zhang's script parses the LaTeX source. I prefer parsing the bbl file, as I sometimes use a whole bunch of LaTeX sources, and BibTeX is quite good at gathering all citations into a single bbl file. You may also want to read the comments on Zhang's post for other solutions. If you don't have a master bib file, you may look at http://www.tex.ac.uk/cgi-bin/texfaq2html?label=makebib, which provides a Perl script that reconstructs a bib file from a bbl file alone.

Update 2010-06-15: I updated the script, as some BBL files were not recognized. If the script doesn't work, have a look at your BBL file. At the moment, the script searches for bibitems like "\bibitem{key}" or "\bibitem[abbr]{key}".
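To check whether a given BBL file is recognized, the search can be tried directly in a terminal. The following is only a sketch of the grep/sed step for the two bibitem forms named above; the helper name `extract_keys` and the sample keys are made up for illustration, and the pattern is written for a plain shell rather than for an AppleScript string.

```shell
#!/bin/sh
# Sketch of the key extraction step; extract_keys is a hypothetical helper name.
extract_keys() {
    # Single-quoted, so \\ in the pattern matches one literal backslash.
    # Group 1 is the optional [label], group 2 is the cite key.
    pattern='\\bibitem[[:space:]]*(\[[^]]*])?[[:space:]]*\{([^}]*)}'
    grep -E "$pattern" "$1" | sed -E "s/$pattern/\\2/"
}

# Hypothetical sample input covering both supported forms.
sample=$(mktemp)
cat > "$sample" <<'EOF'
\begin{thebibliography}{2}
\bibitem{key1}
First entry text.
\bibitem[Abbr]{key2}
Second entry text.
\end{thebibliography}
EOF

extract_keys "$sample"   # prints key1 and key2, one per line
rm -f "$sample"
```

If this prints nothing for your file, the bibitems are probably written in a form the pattern does not cover.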

Yin said...

"Open your master bib file with BibDesk, then select a bbl file and activate the script"

I saw the script on the menu, but was unable to follow the above instruction and make the script work.

Any more detailed instruction?

Jens v.P. said...

Yin, I have updated the script. The previous version only recognized BBL files with bibitems using optional parameters, such as \bibitem[abbr]{key}. I have improved the search expression in order to support bibitems without the optional parameter, such as \bibitem{key}. I also added some info dialogs in case something goes wrong.

Did you succeed in at least creating an empty bib file? Note that you have to install the script into your user's script folder, not in the BibDesk script folder (as the script calls BibDesk, but it uses the current Finder's selection!).

Daniel said...

Hi
Could you modify it so that it looks in more than one master .bib file?
Cheers and thanks for sharing

Jens v.P. said...

@Daniel: Well, I'd figure one would have to add a loop around the search algorithm. That is, after line 86 (you will have to copy the code to an editor with line numbers), add a repeat statement iterating over the list of all opened (master) bibliographies (and break when bFound is true); the end of this loop would be before line 96. Of course, one would have to replace "sourceDoc", computed in lines 22 to 41, with a list.

Cheers,
Jens
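The loop-and-break idea can be illustrated outside of BibDesk as well. The sketch below is not the AppleScript change described above but a plain shell stand-in: it searches several bib files for a cite key and stops at the first file that has it. The helper name `find_key_source` and the sample files are hypothetical, and the literal search for `{key,` is a simplification that could also match inside a field value.

```shell
#!/bin/sh
# Sketch: report which of several bib files defines a cite key.
# find_key_source is a hypothetical helper; usage: find_key_source KEY FILE...
find_key_source() {
    key=$1
    shift
    for bib in "$@"; do
        # A BibTeX entry starts like "@article{key," so search for "{key," literally.
        if grep -qF "{$key," "$bib"; then
            printf '%s\n' "$bib"
            return 0          # stop at the first file that has the key
        fi
    done
    return 1                  # key not found in any file
}

# Hypothetical sample data: two small "master" bib files.
dir=$(mktemp -d)
printf '%s\n' '@article{key1,' '  title = {First},' '}' > "$dir/a.bib"
printf '%s\n' '@book{key2,' '  title = {Second},' '}' > "$dir/b.bib"

find_key_source key2 "$dir/a.bib" "$dir/b.bib"   # prints the path of b.bib
find_key_source key3 "$dir/a.bib" "$dir/b.bib" || echo "key3 not found"
rm -rf "$dir"
```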

ElmerCat said...

An incredibly useful script and an excellent example for others to learn and build upon. One of the advantages of using BibDesk is that it works so perfectly with AppleScript; using the two together, you can do anything you want with the database. Thank you so much for sharing!

Patrick Kwantum said...

Very, very useful... but it died on me recently. It used to work but now, I get a complaint that there are no items or it does not recognize them. I am pretty sure that no journal is waiting for my >2000 articles/books master bib file, so I wonder if you could track down what happens. I do have the obnoxious "Web of science" {{ and }} problem, but that would seem odd as trouble source.
Here is an example bbl file:
\providecommand*{\mcitethebibliography}{\thebibliography}
\csname @ifundefined\endcsname{endmcitethebibliography}
{\let\endmcitethebibliography\endthebibliography}{}
\begin{mcitethebibliography}{1}
\providecommand*{\natexlab}[1]{#1}
\providecommand*{\mciteSetBstSublistMode}[1]{}
\providecommand*{\mciteSetBstMaxWidthForm}[2]{}
{\def\EndOfBibitem{\unskip.}
{\let\EndOfBibitem\relax}
\providecommand*{\mciteSetBstMidEndSepPunct}[3]{}
\providecommand*{\mciteSetBstSublistLabelBeginEnd}[3]{}
\providecommand*{\EndOfBibitem}{}
\mciteSetBstSublistMode{f}
\mciteSetBstMaxWidthForm{subitem}
{(\emph{\alph{mcitesubitemcount})}
\mciteSetBstSublistLabelBeginEnd{\mcitemaxwidthsubitemform\space}
{\relax}{\relax}

\bibitem[MORTIER \emph{et~al.}({1985})MORTIER, VANGENECHTEN, and
GASTEIGER]{MORTIER_1985}
W.~MORTIER, K.~VANGENECHTEN and J.~GASTEIGER, \emph{JOURNAL OF THE AMERICAN
CHEMICAL SOCIETY}, {1985}, \textbf{107}, {829--835}\relax
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}
{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\end{mcitethebibliography}

I would appreciate it if you could have a look!

Patrick Bultinck
Ghent University
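One possible explanation for the sample above: the `\bibitem[` label opens on one line and `]{MORTIER_1985}` follows on the next, while grep matches line by line, so the pattern never sees the label and the key together. A hedged sketch of a workaround is to join the lines before matching. The helper name `extract_keys_multiline` is made up, and the pattern still assumes that the label itself contains no `]`.

```shell
#!/bin/sh
# Sketch: extract cite keys even when a \bibitem label wraps onto the next line.
# extract_keys_multiline is a hypothetical helper name.
extract_keys_multiline() {
    pattern='\\bibitem[[:space:]]*(\[[^]]*])?[[:space:]]*\{([^}]*)}'
    # Join all lines into one, print each match on its own line (-o),
    # then keep only the key (group 2).
    tr '\n' ' ' < "$1" | grep -oE "$pattern" | sed -E "s/$pattern/\\2/"
}

# The wrapped bibitem from the sample, followed by placeholder entry text.
sample=$(mktemp)
cat > "$sample" <<'EOF'
\bibitem[MORTIER \emph{et~al.}({1985})MORTIER, VANGENECHTEN, and
GASTEIGER]{MORTIER_1985}
Entry text.
EOF

extract_keys_multiline "$sample"   # prints MORTIER_1985
rm -f "$sample"
```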

Ebbinghaus said...

Hiya,

I hate feeling like a helpless newbie but in the world of Applescripting I guess I am.

With my .bbl file selected in the finder and my BibDesk file open, when I run the script (from the script icon in the menu bar) I get this message:

"No bibitems found in /Users/GrayMatter/. . ./NSF-CHS-sctn00.bbl, maybe the file does not contain any bibitems or the search pattern does not recognize your file format."

I am using "apa" format and that is showing up before my entries at:

\refsection{0}
\sortlist{entry}{apa}

Any ideas? Thanks.

Jens v.P. said...

@Ebbinghaus I never tried that apa format. Have you tried another format?
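A guess based on the two lines quoted above: `\refsection` and `\sortlist` look like the BBL format written by biblatex, which, as far as I know, records each entry as `\entry{key}{type}{...}` rather than as `\bibitem`. In that case the script's search pattern finds nothing. The following sketch lists the keys of such a file under that assumption; the helper name `extract_biblatex_keys` and the sample content are hypothetical.

```shell
#!/bin/sh
# Sketch: list entry keys from a biblatex-style bbl file (assumed format).
# extract_biblatex_keys is a hypothetical helper name.
extract_biblatex_keys() {
    # Matches lines such as "    \entry{key}{article}{}" and prints only the key.
    sed -nE 's/^[[:space:]]*\\entry\{([^}]*)}.*/\1/p' "$1"
}

# Hypothetical sample in the assumed format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
\refsection{0}
  \sortlist{entry}{apa}
    \entry{key1}{article}{}
    \endentry
    \entry{key2}{book}{}
    \endentry
  \endsortlist
\endrefsection
EOF

extract_biblatex_keys "$sample"   # prints key1 and key2, one per line
rm -f "$sample"
```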