    Explanations

    phrases indicating conditions or exceptions

    Negative Logits
    })`                 -0.56
    уда                 -0.52
    stwie               -0.51
     userDao            -0.50
     pict               -0.50
    Più                 -0.50
     laiko              -0.50
     ſtate              -0.50
     bacio              -0.50
     interesan          -0.50
    Positive Logits
    MessageTagHelper    0.72
     ведь               0.70
    Ведь                0.69
     totiž              0.68
     przecież           0.64
     bowiem             0.61
                        0.60
     '\\;'              0.57
     kasarigan          0.57
     فهي                0.57
    Activation Density: 0.101%

    No Known Activations