INDEX
    Explanations

    phrases conveying ongoing actions or states of being

    Negative Logits
    ambi      -0.15
    grim      -0.14
    __)       -0.14
    arel      -0.13
    esson     -0.13
    slack     -0.13
    hoot      -0.13
    pj        -0.13
    à¥        -0.13
    Inverse   -0.13

    Positive Logits
    rosso      0.15
    że         0.15
    ÃŃÅ¡       0.14
    orer       0.14
    olec       0.14
    ="{!!      0.14
    avern      0.13
    -valid     0.13
    aklı       0.13
    eni        0.13
    Act Density 0.018%

    No Known Activations