INDEX
    Explanations

    headings and labeled sections indicating organization or categorization

    Negative Logits
    lyn         -0.16
    chen        -0.14
    lik         -0.13
    akis        -0.13
     Rings      -0.13
    šak         -0.13
    aven        -0.13
    ull         -0.13
    .spatial    -0.13
    ano         -0.13
    Positive Logits
    :↵          0.25
    :↵↵         0.23
     :↵         0.21
     :↵↵        0.19
    :↵↵↵        0.18
    ofire       0.17
    :č↵         0.16
    pNet        0.16
    á»Ļc        0.15
    çķ          0.15
    Act Density 0.071%

    No Known Activations