INDEX
    Explanations

    mentions of references and citations in a text

    Negative Logits
    Token                Logit
    "ArgsConstructor"    -0.60
    "enumii"             -0.52
    "EndContext"         -0.50
    "miste"              -0.49
    " inapropiados"      -0.49
    "istics"             -0.49
    " Schulze"           -0.49
    " כן"                -0.48
    " Angleterre"        -0.48
    " Хорошо"            -0.48
    Positive Logits
    Token                Logit
    " REF"               0.95
    " refs"              0.93
    "Refer"              0.89
    " Refs"              0.88
    " references"        0.87
    " refer"             0.84
    " References"        0.83
    " useRef"            0.83
    " Refer"             0.81
    " Reference"         0.79
    Act Density 0.339%

    No Known Activations