Document (#39277)

Author
Gatenby, J.
Thornburg, G.
Weitz, J.
Title
Collected work clustering in WorldCat : three techniques for maintaining records
Source
Code4Lib journal. Issue 30(2015), [http://journal.code4lib.org]
Year
2015
Abstract
WorldCat records are clustered into works, and within works, into content and manifestation clusters. A recent project revisited the clustering of collected works that had been previously sidelined because of the challenges posed by their complexity. Attention was given to both the identification of collected works and to the determination of the component works within them. By extensively analysing cast-list information, performance notes, contents notes, titles, uniform titles and added entries, the contents of collected works could be identified and differentiated so that correct clustering was achieved. Further work is envisaged in the form of refining the tests and weights and also in the creation and use of name/title authority records and other knowledge cards in clustering. There is a requirement to link collected works with their component works for use in search and retrieval.
Content
See: http://journal.code4lib.org/articles/10963.
Theme
Descriptive cataloguing (Formalerschließung)
Object
WorldCat
FRBR

Similar documents (content)

  1. FictionFinder : a FRBR-based prototype for fiction in WorldCat (n.d.) 0.29
    0.29117522 = sum of:
      0.29117522 = product of:
        0.90992254 = sum of:
          0.031530537 = weight(abstract_txt:into in 3432) [ClassicSimilarity], result of:
            0.031530537 = score(doc=3432,freq=2.0), product of:
              0.06432557 = queryWeight, product of:
                1.0202955 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01705282 = queryNorm
              0.4901711 = fieldWeight in 3432, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.03395174 = weight(abstract_txt:work in 3432) [ClassicSimilarity], result of:
            0.03395174 = score(doc=3432,freq=2.0), product of:
              0.06757781 = queryWeight, product of:
                1.04577 = boost
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.01705282 = queryNorm
              0.50240964 = fieldWeight in 3432, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.09710985 = weight(abstract_txt:manifestation in 3432) [ClassicSimilarity], result of:
            0.09710985 = score(doc=3432,freq=1.0), product of:
              0.1361669 = queryWeight, product of:
                1.0496752 = boost
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.01705282 = queryNorm
              0.7131678 = fieldWeight in 3432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.10676526 = weight(abstract_txt:clustered in 3432) [ClassicSimilarity], result of:
            0.10676526 = score(doc=3432,freq=1.0), product of:
              0.14504944 = queryWeight, product of:
                1.0833709 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01705282 = queryNorm
              0.7360612 = fieldWeight in 3432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.08290165 = weight(abstract_txt:titles in 3432) [ClassicSimilarity], result of:
            0.08290165 = score(doc=3432,freq=1.0), product of:
              0.15438846 = queryWeight, product of:
                1.5806713 = boost
                5.727658 = idf(docFreq=392, maxDocs=44421)
                0.01705282 = queryNorm
              0.53696793 = fieldWeight in 3432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.727658 = idf(docFreq=392, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.12814994 = weight(abstract_txt:records in 3432) [ClassicSimilarity], result of:
            0.12814994 = score(doc=3432,freq=5.0), product of:
              0.13817371 = queryWeight, product of:
                1.8314391 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.01705282 = queryNorm
              0.9274553 = fieldWeight in 3432, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.17762472 = weight(abstract_txt:worldcat in 3432) [ClassicSimilarity], result of:
            0.17762472 = score(doc=3432,freq=1.0), product of:
              0.25659114 = queryWeight, product of:
                2.037769 = boost
                7.3839793 = idf(docFreq=74, maxDocs=44421)
                0.01705282 = queryNorm
              0.69224805 = fieldWeight in 3432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3839793 = idf(docFreq=74, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
          0.25188887 = weight(abstract_txt:works in 3432) [ClassicSimilarity], result of:
            0.25188887 = score(doc=3432,freq=1.0), product of:
              0.51412106 = queryWeight, product of:
                5.768951 = boost
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.01705282 = queryNorm
              0.4899408 = fieldWeight in 3432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.09375 = fieldNorm(doc=3432)
        0.32 = coord(8/25)
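
The per-term arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of Lucene. It recomputes the weight of the term "into" in document 3432 from the values shown in the tree.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """score = queryWeight * fieldWeight, mirroring the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # query side
    field_weight = classic_tf(freq) * idf * field_norm  # document side
    return query_weight * field_weight


# Values copied from the "abstract_txt:into in 3432" branch above.
into_score = term_score(
    freq=2.0,
    doc_freq=2993,
    max_docs=44421,
    boost=1.0202955,
    query_norm=0.01705282,
    field_norm=0.09375,
)
print(f"idf   = {classic_idf(2993, 44421):.6f}")
print(f"score = {into_score:.9f}")
```

Lucene works in single-precision floats, so a recomputation in Python doubles should agree with the listed 0.031530537 only to roughly six or seven significant digits.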
    
  2. Dwyer, J.: Bibliographic records enhancement : from the drawing board to the catalog screen (1991) 0.17
    0.17247501 = sum of:
      0.17247501 = product of:
        0.862375 = sum of:
          0.096718594 = weight(abstract_txt:titles in 639) [ClassicSimilarity], result of:
            0.096718594 = score(doc=639,freq=1.0), product of:
              0.15438846 = queryWeight, product of:
                1.5806713 = boost
                5.727658 = idf(docFreq=392, maxDocs=44421)
                0.01705282 = queryNorm
              0.6264626 = fieldWeight in 639, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.727658 = idf(docFreq=392, maxDocs=44421)
                0.109375 = fieldNorm(doc=639)
          0.19933274 = weight(abstract_txt:contents in 639) [ClassicSimilarity], result of:
            0.19933274 = score(doc=639,freq=4.0), product of:
              0.1575097 = queryWeight, product of:
                1.5965694 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.01705282 = queryNorm
              1.2655268 = fieldWeight in 639, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.109375 = fieldNorm(doc=639)
          0.17789604 = weight(abstract_txt:notes in 639) [ClassicSimilarity], result of:
            0.17789604 = score(doc=639,freq=3.0), product of:
              0.16069855 = queryWeight, product of:
                1.61265 = boost
                5.8435345 = idf(docFreq=349, maxDocs=44421)
                0.01705282 = queryNorm
              1.107017 = fieldWeight in 639, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8435345 = idf(docFreq=349, maxDocs=44421)
                0.109375 = fieldNorm(doc=639)
          0.09455734 = weight(abstract_txt:records in 639) [ClassicSimilarity], result of:
            0.09455734 = score(doc=639,freq=2.0), product of:
              0.13817371 = queryWeight, product of:
                1.8314391 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.01705282 = queryNorm
              0.68433666 = fieldWeight in 639, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.109375 = fieldNorm(doc=639)
          0.29387036 = weight(abstract_txt:works in 639) [ClassicSimilarity], result of:
            0.29387036 = score(doc=639,freq=1.0), product of:
              0.51412106 = queryWeight, product of:
                5.768951 = boost
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.01705282 = queryNorm
              0.5715976 = fieldWeight in 639, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.109375 = fieldNorm(doc=639)
        0.2 = coord(5/25)
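
The document-level score is the sum of the matching term scores multiplied by the coordination factor, which is the fraction of query terms found in the document. A minimal sketch using the five term scores listed for document 639 above; the helper names are hypothetical, and the query length of 25 terms is read off the coord(5/25) line.

```python
def coord(overlap: int, max_overlap: int) -> float:
    """Fraction of query terms that matched the document."""
    return overlap / max_overlap


def document_score(term_scores, query_term_count: int) -> float:
    """Sum of per-term scores, scaled by the coordination factor."""
    return sum(term_scores) * coord(len(term_scores), query_term_count)


# Per-term scores for document 639: titles, contents, notes, records, works.
dwyer_term_scores = [
    0.096718594,
    0.19933274,
    0.17789604,
    0.09455734,
    0.29387036,
]

dwyer_score = document_score(dwyer_term_scores, query_term_count=25)
print(f"sum   = {sum(dwyer_term_scores):.6f}")
print(f"score = {dwyer_score:.8f}")
```

This explains why document 3432 ranks first even though its individual term scores are modest: it matches 8 of the 25 query terms, so its coord factor is 0.32 rather than 0.2.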
    
  3. Hickey, T.B.; O'Neill, E.T.; Toves, J.: Experiments with the IFLA Functional Requirements for Bibliographic Records (FRBR) (2002) 0.16
    0.15600675 = sum of:
      0.15600675 = product of:
        0.78003377 = sum of:
          0.029727275 = weight(abstract_txt:into in 2660) [ClassicSimilarity], result of:
            0.029727275 = score(doc=2660,freq=1.0), product of:
              0.06432557 = queryWeight, product of:
                1.0202955 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01705282 = queryNorm
              0.46213776 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.125 = fieldNorm(doc=2660)
          0.04526899 = weight(abstract_txt:work in 2660) [ClassicSimilarity], result of:
            0.04526899 = score(doc=2660,freq=2.0), product of:
              0.06757781 = queryWeight, product of:
                1.04577 = boost
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.01705282 = queryNorm
              0.6698795 = fieldWeight in 2660, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.125 = fieldNorm(doc=2660)
          0.1323527 = weight(abstract_txt:records in 2660) [ClassicSimilarity], result of:
            0.1323527 = score(doc=2660,freq=3.0), product of:
              0.13817371 = queryWeight, product of:
                1.8314391 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.01705282 = queryNorm
              0.95787174 = fieldWeight in 2660, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.125 = fieldNorm(doc=2660)
          0.23683296 = weight(abstract_txt:worldcat in 2660) [ClassicSimilarity], result of:
            0.23683296 = score(doc=2660,freq=1.0), product of:
              0.25659114 = queryWeight, product of:
                2.037769 = boost
                7.3839793 = idf(docFreq=74, maxDocs=44421)
                0.01705282 = queryNorm
              0.9229974 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3839793 = idf(docFreq=74, maxDocs=44421)
                0.125 = fieldNorm(doc=2660)
          0.33585185 = weight(abstract_txt:works in 2660) [ClassicSimilarity], result of:
            0.33585185 = score(doc=2660,freq=1.0), product of:
              0.51412106 = queryWeight, product of:
                5.768951 = boost
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.01705282 = queryNorm
              0.6532544 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.125 = fieldNorm(doc=2660)
        0.2 = coord(5/25)
    
  4. Carlyle, A.; Summerlin, J.: Transforming catalog displays : records clustering for works of fiction (2000) 0.14
    0.13961063 = sum of:
      0.13961063 = product of:
        0.6980531 = sum of:
          0.018579546 = weight(abstract_txt:into in 1100) [ClassicSimilarity], result of:
            0.018579546 = score(doc=1100,freq=1.0), product of:
              0.06432557 = queryWeight, product of:
                1.0202955 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01705282 = queryNorm
              0.2888361 = fieldWeight in 1100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.078125 = fieldNorm(doc=1100)
          0.020006256 = weight(abstract_txt:work in 1100) [ClassicSimilarity], result of:
            0.020006256 = score(doc=1100,freq=1.0), product of:
              0.06757781 = queryWeight, product of:
                1.04577 = boost
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.01705282 = queryNorm
              0.29604772 = fieldWeight in 1100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.078125 = fieldNorm(doc=1100)
          0.09551734 = weight(abstract_txt:records in 1100) [ClassicSimilarity], result of:
            0.09551734 = score(doc=1100,freq=4.0), product of:
              0.13817371 = queryWeight, product of:
                1.8314391 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.01705282 = queryNorm
              0.6912844 = fieldWeight in 1100, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.078125 = fieldNorm(doc=1100)
          0.35404256 = weight(abstract_txt:clustering in 1100) [ClassicSimilarity], result of:
            0.35404256 = score(doc=1100,freq=4.0), product of:
              0.36423963 = queryWeight, product of:
                3.433545 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01705282 = queryNorm
              0.9720045 = fieldWeight in 1100, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.078125 = fieldNorm(doc=1100)
          0.20990741 = weight(abstract_txt:works in 1100) [ClassicSimilarity], result of:
            0.20990741 = score(doc=1100,freq=1.0), product of:
              0.51412106 = queryWeight, product of:
                5.768951 = boost
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.01705282 = queryNorm
              0.408284 = fieldWeight in 1100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.078125 = fieldNorm(doc=1100)
        0.2 = coord(5/25)
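
In the first four entries the term "works" occurs once per abstract, yet its score differs from document to document. The only factor that changes is fieldNorm, which in ClassicSimilarity is a coarsely quantised length normalisation, so shorter abstracts receive a larger value. A minimal sketch that isolates this effect, using the queryWeight and idf values listed above; the dictionary and function names are illustrative only.

```python
# Constants copied from the "abstract_txt:works" branches above.
QUERY_WEIGHT_WORKS = 0.51412106
IDF_WORKS = 5.226035

# fieldNorm per document id, as listed in entries 1 to 4.
field_norms = {3432: 0.09375, 639: 0.109375, 2660: 0.125, 1100: 0.078125}


def single_occurrence_score(field_norm: float) -> float:
    """Score of a term occurring once: tf = sqrt(1) = 1."""
    field_weight = 1.0 * IDF_WORKS * field_norm
    return QUERY_WEIGHT_WORKS * field_weight


scores = {doc: single_occurrence_score(norm) for doc, norm in field_norms.items()}

# Order the documents by the contribution of "works" alone.
ranked = sorted(scores, key=scores.get, reverse=True)
for doc in ranked:
    print(doc, f"{scores[doc]:.6f}")
```

Document 3671 is left out of the comparison because "works" occurs nine times in its abstract, so its tf factor differs as well.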
    
  5. Smiraglia, R.P.: Knowledge sharing and content genealogy : extending the "works" model as a metaphor for non-documentary artefacts with case studies of Etruscan artefacts (2004) 0.13
    0.12576826 = sum of:
      0.12576826 = product of:
        0.6288413 = sum of:
          0.013005683 = weight(abstract_txt:into in 3671) [ClassicSimilarity], result of:
            0.013005683 = score(doc=3671,freq=1.0), product of:
              0.06432557 = queryWeight, product of:
                1.0202955 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01705282 = queryNorm
              0.20218527 = fieldWeight in 3671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3671)
          0.0140043795 = weight(abstract_txt:work in 3671) [ClassicSimilarity], result of:
            0.0140043795 = score(doc=3671,freq=1.0), product of:
              0.06757781 = queryWeight, product of:
                1.04577 = boost
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.01705282 = queryNorm
              0.2072334 = fieldWeight in 3671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3671)
          0.04727867 = weight(abstract_txt:records in 3671) [ClassicSimilarity], result of:
            0.04727867 = score(doc=3671,freq=2.0), product of:
              0.13817371 = queryWeight, product of:
                1.8314391 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.01705282 = queryNorm
              0.34216833 = fieldWeight in 3671, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3671)
          0.11374701 = weight(abstract_txt:collected in 3671) [ClassicSimilarity], result of:
            0.11374701 = score(doc=3671,freq=1.0), product of:
              0.37059668 = queryWeight, product of:
                3.8721745 = boost
                5.612423 = idf(docFreq=440, maxDocs=44421)
                0.01705282 = queryNorm
              0.30692938 = fieldWeight in 3671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.612423 = idf(docFreq=440, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3671)
          0.44080552 = weight(abstract_txt:works in 3671) [ClassicSimilarity], result of:
            0.44080552 = score(doc=3671,freq=9.0), product of:
              0.51412106 = queryWeight, product of:
                5.768951 = boost
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.01705282 = queryNorm
              0.85739636 = fieldWeight in 3671, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.226035 = idf(docFreq=648, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3671)
        0.2 = coord(5/25)