INDEX
    Explanations

    descriptions of physical impacts or injuries

    Negative Logits
    "saraba"       -0.59
    " mengan"      -0.46
    "ovací"        -0.44
    " behalten"    -0.44
    "glow"         -0.42
    "Glow"         -0.41
    "stdbool"      -0.41
    " Kep"         -0.40
    " توا"         -0.40
    " poisoned"    -0.40
    Positive Logits
    " rough"          0.90
    " gentle"         0.83
    " forceful"       0.81
    " careless"       0.81
    " gynhyrchwyd"    0.78
    " jost"           0.78
    " Rough"          0.78
    " clumsy"         0.77
    " gentleness"     0.77
    " mish"           0.76
    Activation Density: 0.340%

    No Known Activations