INDEX
    Explanations

    the word "except" in various contexts

    Negative Logits
    Token       Logit
    "isman"     -0.17
    "isha"      -0.15
    "chten"     -0.14
    "aleza"     -0.14
    "ulum"      -0.14
    "mina"      -0.14
    "789"       -0.14
    "erosis"    -0.13
    "lek"       -0.13
    "_$_"       -0.13
    Positive Logits
    Token       Logit
    "ing"       0.17
    "ablish"    0.16
    "oint"      0.16
    " thumbs"   0.15
    "iro"       0.14
    "aşı"       0.14
    "enville"   0.14
    " Made"     0.14
    "洲"        0.14
    " Evet"     0.14
    Act Density 0.008%

    No Known Activations