INDEX
    Explanations

    titles or headings annotated with special symbols

    titles of articles or reports

    Negative Logits
     shroud            -0.81
     destro            -0.74
     range             -0.73
     sear              -0.72
     fragmentation     -0.71
     sofa              -0.70
     hemor             -0.69
     Roc               -0.66
     leap              -0.66
     neighb            -0.65
    Positive Logits
    ª     1.27
    ¹     1.20
    ł     1.17
    ı     1.12
    Ĵ     1.03
    ³     0.99
    ¡     0.99
    ij     0.98
    «     0.94
    »     0.93
    Act Density 0.151%

    No Known Activations