    Explanations

    references to logical concepts and structures

    Negative Logits
    kening        -0.17
    ê²            -0.17
    argas         -0.15
    zsche         -0.15
    ego           -0.15
    udeau         -0.14
    etten         -0.14
    lsen          -0.14
    ushima        -0.14
     marg         -0.14
    Positive Logits
    agma          0.17
    olon          0.15
     Naming       0.14
    ãĤ°ãĥ«        0.14
    ved           0.14
    ç´Ģ           0.14
    سÙĪØ¨         0.14
    oron          0.14
    aces          0.13
     Sequential   0.13
    Activation Density: 0.049%

    No Known Activations