    Negative Logits
    Token           Logit
    " Inscrivez"    -0.43
    " berdua"       -0.40
    " izvē"         -0.38
    "("             -0.36
    " lemah"        -0.36
    " is"           -0.35
    " dė"           -0.35
    " anlaş"        -0.34
    " laikā"        -0.34
    " Económica"    -0.33
    Positive Logits
    Token               Logit
    "AddTagHelper"      0.69
    " autorytatywna"    0.64
    "setVerticalGroup"  0.64
    "RTGC"              0.62
    "uxxxx"             0.62
    "complexContent"    0.61
    " faſt"             0.60
    "ftagPool"          0.60
    " Verſ"             0.59
    "ſelves"            0.58
    Activation Density: 0.083%

    No Known Activations