    Explanations

    verbs and prepositions that indicate a process or transformation

    Negative Logits
    logan    -0.18
    BUM      -0.16
    äs       -0.16
    κÎŃ      -0.16
    ekim     -0.15
    quip     -0.15
    онов     -0.15
    quan     -0.15
    iri      -0.15
    han      -0.14
    Positive Logits
    ey       0.18
    .datab   0.16
    ady      0.16
    eyJ      0.16
    elic     0.15
    elly     0.15
    shot     0.15
    ãĤ·ãĥ¼   0.14
    its      0.14
    anian    0.14
    Activation Density 0.002%

    No Known Activations