INDEX
    Explanations

    the word "on" used in various contexts

    Negative Logits
    Token      Logit
    "iring"    -0.16
    "olas"     -0.16
    "deo"      -0.15
    "lez"      -0.15
    "vais"     -0.14
    "pie"      -0.14
    " Wed"     -0.14
    "621"      -0.14
    "zug"      -0.14
    "gate"     -0.14
    Positive Logits
    Token      Logit
    "eli"      0.15
    "ходим"    0.15
    "omba"     0.14
    "ubat"     0.14
    "↵↵"       0.14
    "ook"      0.14
    "isNull"   0.14
    " tslib"   0.14
    "erus"     0.13
    "兼"       0.13
    Activation Density 0.211%

    No Known Activations