INDEX
    Explanations

    terminology and definitions related to concepts and communication

    Negative Logits
    Token          Logit
    ".ali"         -0.17
    "lage"         -0.16
    " finished"    -0.14
    "legg"         -0.14
    "kee"          -0.14
    "θι"           -0.14
    "-lines"       -0.14
    "finished"     -0.14
    "Finished"     -0.13
    "legen"        -0.13

    Positive Logits
    Token          Logit
    " USED"        0.15
    " Hlav"        0.14
    " terms"       0.14
    " æ°"          0.14
    " Babe"        0.14
    " Hoy"         0.14
    " Sp"          0.14
    "imen"         0.14
    " Used"        0.14
    " klu"         0.14
    Act Density 0.097%

    No Known Activations