INDEX
    Explanations

    phrases indicating uncertainty, such as modal verbs and adverbs that suggest possibility or frequency

    Negative Logits
    ayet
    -0.17
    ichel
    -0.16
    ifen
    -0.16
    -Compatible
    -0.16
    strup
    -0.15
    .fx
    -0.15
    ettings
    -0.15
    taş
    -0.14
    ighton
    -0.14
    dik
    -0.14
    Positive Logits
     even
    0.28
     sogar
    0.19
     also
    0.19
    even
    0.18
     même
    0.17
    akan
    0.17
     cả
    0.17
     даже
    0.16
     Even
    0.16
    还有
    0.16
    Act Density 0.094%

    No Known Activations