    Explanations

    references or citations in a document

    NEGATIVE LOGITS
    Token         Logit
    "All"         -0.53
    "Under"       -0.51
    "The"         -0.49
    "At"          -0.49
    "↵↵"          -0.47
    " Under"      -0.47
    "dorfer"      -0.47
    "-"           -0.47
    "O"           -0.47
    "E"           -0.47
    POSITIVE LOGITS
    Token         Logit
    "UserScript"  0.85
    "ſelf"        0.79
    "ſelves"      0.76
    "DockStyle"   0.75
    "oa̍t"         0.75
    " mergeFrom"  0.75
    " Forumite"   0.74
    " itſelf"     0.71
    " Efq"        0.71
    " iſt"        0.70
    Activation Density: 0.018%

    No Known Activations