Document (#39467)

Boiger, W.
Entwicklung und Implementierung eines MARC21-MARCXML-Konverters in der Programmiersprache Perl
Perspektive Bibliothek. 4(2015) no.2, pp.33-59
The shared catalogue of the Bibliotheksverbund Bayern and the Kooperativer Bibliotheksverbund Berlin-Brandenburg (B3Kat) currently holds about 25.6 million title records. The Bayerische Verbundzentrale has published this data on its website since 2011 as part of the Bavarian open data initiative. Reusers of the data include the Deutsche Digitale Bibliothek and the DNB's Culturegraph project. The data is published in MARCXML, a widely used catalogue data format. Until 2014 the Verbundzentrale generated the XML files with the Windows software MarcEdit. In early 2015, as part of the Bavarian library traineeship (Referendarsausbildung), the author developed a simple MARC 21 to MARCXML converter in Perl that makes the conversion considerably easier and makes MarcEdit unnecessary at the Verbundzentrale. This paper, written together with the converter, first motivates the need for a Perl implementation. It then examines the bibliographic data formats MARC 21 and MARCXML and explains the properties that matter for the conversion. Finally, it describes the structure of the converter in detail. The Perl implementation itself is part of the paper. Use, distribution, and modification of the software are permitted under the terms of the GNU Affero General Public License, either version 3 of the License or (at your option) any later version. [The file containing the Perl implementation can be found in the right-hand column, in the category Artikelwerkzeuge (article tools), under the item Zusatzdateien (supplementary files).]
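
To illustrate the kind of conversion the abstract describes, here is a minimal Python sketch of a MARC 21 to MARCXML mapping. It is not the author's Perl converter. It assumes UTF-8 encoded, well-formed ISO 2709 records, and it handles one record at a time in memory. The function names `marc_to_xml` and `build_record` and the sample record contents are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

FT, RT, SD = b"\x1e", b"\x1d", b"\x1f"  # field terminator, record terminator, subfield delimiter
NS = "http://www.loc.gov/MARC21/slim"   # MARCXML namespace
ET.register_namespace("", NS)


def marc_to_xml(raw: bytes) -> ET.Element:
    """Convert one binary MARC 21 record into a MARCXML <record> element."""
    leader = raw[:24].decode("utf-8")
    base = int(leader[12:17])            # base address of data (leader positions 12-16)
    directory = raw[24:base - 1]         # 12-byte entries, without the closing terminator
    record = ET.Element(f"{{{NS}}}record")
    ET.SubElement(record, f"{{{NS}}}leader").text = leader
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop the field terminator
        if tag.startswith("00"):
            # Control fields (001-009) carry no indicators or subfields.
            field = ET.SubElement(record, f"{{{NS}}}controlfield", {"tag": tag})
            field.text = data.decode("utf-8")
        else:
            # Data fields: two indicator characters, then delimited subfields.
            parts = data.split(SD)
            ind = parts[0].decode("utf-8")
            field = ET.SubElement(
                record, f"{{{NS}}}datafield",
                {"tag": tag, "ind1": ind[0], "ind2": ind[1]},
            )
            for part in parts[1:]:
                sub = ET.SubElement(field, f"{{{NS}}}subfield", {"code": chr(part[0])})
                sub.text = part[1:].decode("utf-8")
    return record


def build_record(fields):
    """Hypothetical helper: assemble a minimal ISO 2709 record from (tag, bytes) pairs."""
    directory, body = b"", b""
    for tag, data in fields:
        data += FT
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1       # leader + directory + directory terminator
    total = base + len(body) + 1         # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader.encode("ascii") + directory + FT + body + RT


# Made-up sample record: one control field and one title field.
raw = build_record([
    ("001", b"X000000001"),
    ("245", b"10" + SD + b"aEin Titel" + SD + b"bZusatz"),
])
record = marc_to_xml(raw)
print(ET.tostring(record, encoding="unicode"))
```

For a dump of many millions of records, a real converter would more plausibly read records one by one from the input stream and write each `<record>` out immediately, rather than building a single tree in memory.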

Similar documents (content)

  1. McCallum, S.H.: MARCXML sampler (2005) 0.25
    0.25114256 = sum of:
      0.25114256 = product of:
        3.1392822 = sum of:
          0.058712237 = weight(abstract_txt:marc in 5361) [ClassicSimilarity], result of:
            0.058712237 = score(doc=5361,freq=2.0), product of:
              0.080442056 = queryWeight, product of:
                1.2917415 = boost
                5.5050235 = idf(docFreq=490, maxDocs=44421)
                0.011312234 = queryNorm
              0.7298699 = fieldWeight in 5361, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5050235 = idf(docFreq=490, maxDocs=44421)
                0.09375 = fieldNorm(doc=5361)
          3.08057 = weight(title_txt:marcxml in 5361) [ClassicSimilarity], result of:
            3.08057 = score(doc=5361,freq=1.0), product of:
              0.50524145 = queryWeight, product of:
                4.5782394 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.011312234 = queryNorm
              6.0972233 = fieldWeight in 5361, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.625 = fieldNorm(doc=5361)
        0.08 = coord(2/25)
  2. Anderson, R.; Birbeck, M.; Kay, M.; Livingstone, S.; Loesgen, B.; Martin, D.; Mohr, S.; Ozu, N.; Peat, B.; Pinnock, J.; Stark, P.; Williams, K.: XML professionell : behandelt W3C DOM, SAX, CSS, XSLT, DTDs, XML Schemas, XLink, XPointer, XPath, E-Commerce, BizTalk, B2B, SOAP, WAP, WML (2000) 0.08
    0.07800157 = sum of:
      0.07800157 = product of:
        0.27857706 = sum of:
          0.0068657636 = weight(abstract_txt:software in 1729) [ClassicSimilarity], result of:
            0.0068657636 = score(doc=1729,freq=1.0), product of:
              0.050413575 = queryWeight, product of:
                1.0226047 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.011312234 = queryNorm
              0.13618879 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.016299032 = weight(abstract_txt:ihrer in 1729) [ClassicSimilarity], result of:
            0.016299032 = score(doc=1729,freq=2.0), product of:
              0.07120647 = queryWeight, product of:
                1.2153287 = boost
                5.1793747 = idf(docFreq=679, maxDocs=44421)
                0.011312234 = queryNorm
              0.22889818 = fieldWeight in 1729, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1793747 = idf(docFreq=679, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.012277093 = weight(abstract_txt:version in 1729) [ClassicSimilarity], result of:
            0.012277093 = score(doc=1729,freq=1.0), product of:
              0.07427089 = queryWeight, product of:
                1.2412045 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.011312234 = queryNorm
              0.16530155 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.012311509 = weight(abstract_txt:arbeit in 1729) [ClassicSimilarity], result of:
            0.012311509 = score(doc=1729,freq=1.0), product of:
              0.07440963 = queryWeight, product of:
                1.2423632 = boost
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.011312234 = queryNorm
              0.16545586 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.018014882 = weight(abstract_txt:daten in 1729) [ClassicSimilarity], result of:
            0.018014882 = score(doc=1729,freq=1.0), product of:
              0.10978414 = queryWeight, product of:
                1.8482021 = boost
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.011312234 = queryNorm
              0.16409366 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.061964408 = weight(abstract_txt:implementierung in 1729) [ClassicSimilarity], result of:
            0.061964408 = score(doc=1729,freq=1.0), product of:
              0.27533397 = queryWeight, product of:
                3.379706 = boost
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.011312234 = queryNorm
              0.2250518 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
          0.15084437 = weight(abstract_txt:perl in 1729) [ClassicSimilarity], result of:
            0.15084437 = score(doc=1729,freq=1.0), product of:
              0.5367281 = queryWeight, product of:
                5.2757134 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.011312234 = queryNorm
              0.2810443 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.03125 = fieldNorm(doc=1729)
        0.28 = coord(7/25)
  3. Weinheimer, J.: ¬A visual explanation of the areas defined by AACR2, RDA, ISBD, LC NAF, LC Classification, LC Subject Headings, Dewey Classification, MARC21 : plus a quick look at ISO2709, MARCXML and a version of BIBFRAME (2015) 0.07
    0.06640523 = sum of:
      0.06640523 = product of:
        0.8300654 = sum of:
          0.059922922 = weight(abstract_txt:marc in 3882) [ClassicSimilarity], result of:
            0.059922922 = score(doc=3882,freq=3.0), product of:
              0.080442056 = queryWeight, product of:
                1.2917415 = boost
                5.5050235 = idf(docFreq=490, maxDocs=44421)
                0.011312234 = queryNorm
              0.7449203 = fieldWeight in 3882, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5050235 = idf(docFreq=490, maxDocs=44421)
                0.078125 = fieldNorm(doc=3882)
          0.7701425 = weight(title_txt:marcxml in 3882) [ClassicSimilarity], result of:
            0.7701425 = score(doc=3882,freq=1.0), product of:
              0.50524145 = queryWeight, product of:
                4.5782394 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.011312234 = queryNorm
              1.5243058 = fieldWeight in 3882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.15625 = fieldNorm(doc=3882)
        0.08 = coord(2/25)
  4. Standage, T.: Perl : the glue of the Internet (1995) 0.06
    0.06308405 = sum of:
      0.06308405 = product of:
        0.7885506 = sum of:
          0.03432882 = weight(abstract_txt:software in 3457) [ClassicSimilarity], result of:
            0.03432882 = score(doc=3457,freq=1.0), product of:
              0.050413575 = queryWeight, product of:
                1.0226047 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.011312234 = queryNorm
              0.68094397 = fieldWeight in 3457, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.15625 = fieldNorm(doc=3457)
          0.7542218 = weight(abstract_txt:perl in 3457) [ClassicSimilarity], result of:
            0.7542218 = score(doc=3457,freq=1.0), product of:
              0.5367281 = queryWeight, product of:
                5.2757134 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.011312234 = queryNorm
              1.4052215 = fieldWeight in 3457, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.15625 = fieldNorm(doc=3457)
        0.08 = coord(2/25)
  5. Junger, U.; Hapke, T.: Erschließung 2013: Visionen und mögliche Entwicklungen : Bericht über einen Workshop der Facharbeitsgruppe Erschließung und Informationsvermittlung auf der 12. Verbundkonferenz des Gemeinsamen Bibliotheksverbundes am 11. September 2008 in der Staatsbibliothek zu Berlin - Preußischer Kulturbesitz (2008) 0.06
    0.06275537 = sum of:
      0.06275537 = product of:
        0.31377685 = sum of:
          0.019684931 = weight(abstract_txt:unter in 3484) [ClassicSimilarity], result of:
            0.019684931 = score(doc=3484,freq=3.0), product of:
              0.060794424 = queryWeight, product of:
                1.1229641 = boost
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.011312234 = queryNorm
              0.32379502 = fieldWeight in 3484, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3484)
          0.02176388 = weight(abstract_txt:arbeit in 3484) [ClassicSimilarity], result of:
            0.02176388 = score(doc=3484,freq=2.0), product of:
              0.07440963 = queryWeight, product of:
                1.2423632 = boost
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.011312234 = queryNorm
              0.2924874 = fieldWeight in 3484, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3484)
          0.03900336 = weight(abstract_txt:daten in 3484) [ClassicSimilarity], result of:
            0.03900336 = score(doc=3484,freq=3.0), product of:
              0.10978414 = queryWeight, product of:
                1.8482021 = boost
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.011312234 = queryNorm
              0.3552732 = fieldWeight in 3484, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3484)
          0.10438321 = weight(abstract_txt:bibliotheksverbundes in 3484) [ClassicSimilarity], result of:
            0.10438321 = score(doc=3484,freq=2.0), product of:
              0.21162096 = queryWeight, product of:
                2.0951414 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.011312234 = queryNorm
              0.49325553 = fieldWeight in 3484, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3484)
          0.12894149 = weight(abstract_txt:verbundzentrale in 3484) [ClassicSimilarity], result of:
            0.12894149 = score(doc=3484,freq=2.0), product of:
              0.27888843 = queryWeight, product of:
                2.9457433 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.011312234 = queryNorm
              0.46234077 = fieldWeight in 3484, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3484)
        0.2 = coord(5/25)
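
The score trees above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity. The Python sketch below recomputes the score of the first hit (doc 5361) from the values shown in its tree. The function names are hypothetical. The formulas are read off the listing: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the summed term scores are scaled by the coord factor.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm            # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(terms, matched, query_terms):
    """Sum of term scores, scaled by coord = matched / query_terms."""
    return sum(term_score(**t) for t in terms) * (matched / query_terms)


QUERY_NORM = 0.011312234
MAX_DOCS = 44421

# Values copied from the explanation tree for doc 5361.
doc_5361 = [
    dict(freq=2.0, doc_freq=490, max_docs=MAX_DOCS, boost=1.2917415,
         query_norm=QUERY_NORM, field_norm=0.09375),   # abstract_txt:marc
    dict(freq=1.0, doc_freq=6, max_docs=MAX_DOCS, boost=4.5782394,
         query_norm=QUERY_NORM, field_norm=0.625),     # title_txt:marcxml
]

score = document_score(doc_5361, matched=2, query_terms=25)
print(round(score, 6))
```

The result agrees with the listed 0.25114256 up to rounding. Small differences in the last digits are expected because Lucene computes these values in single precision.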