INDEX
    Explanations

    Modal verbs indicating possibility or necessity

    Negative Logits
    Token      Logit
    "ilor"     -0.16
    "arias"    -0.15
    "urt"      -0.15
    "alian"    -0.14
    " ke"      -0.14
    "ylie"     -0.14
    "iris"     -0.13
    "706"      -0.13
    "alia"     -0.13
    "yst"      -0.13
    Positive Logits
    Token      Logit
    "ville"    0.16
    "bach"     0.16
    "ij"       0.15
    "abolic"   0.15
    "ase"      0.15
    " vay"     0.14
    "갤"       0.14
    "Coeff"    0.14
    "zelf"     0.14
    "Stamp"    0.14
    Activation Density: 0.117%

    No Known Activations