    Explanations

    prepositions and following words

    Negative Logits
    Token             Value
     ENG              0.68
     даже             0.62
     bahkan           0.62
     BE               0.60
     résultats        0.59
     zelfs            0.59
     estadísticas     0.59
    avip              0.57
     dai              0.56
     high             0.55
    Positive Logits
    Token             Value
    </h2>             0.75
    ;                 0.69
    >                 0.66
    :                 0.61
                      0.59
    ↵↵↵               0.58
    </b>              0.58
    </h3>             0.57
    -=                0.56
    «                 0.56
    Activation Density: 0.029%

    No Known Activations