    NEGATIVE LOGITS
    Token       Value
    "for"       1.52
    " for"      1.30
    "ک"         1.13
    "1"         1.10
    "0"         1.10
    "وک"        1.07
    (blank)     1.07
    "For"       1.04
    "Liu"       1.04
    "،"         1.04
    POSITIVE LOGITS
    Token               Value
    (blank)             1.19
    ","                 0.92
    "ів"                0.90
    " саме"             0.86
    " cautionary"       0.84
    " typographical"    0.82
    " rational"         0.80
    " colloquial"       0.79
    (blank)             0.79
    "も含"              0.77
    Activation Density: 0.000%

    No Known Activations