INDEX
    Explanations

    negative or dismissive expressions regarding military or political topics

    NEGATIVE LOGITS
    ulling      -0.16
    ramework    -0.16
    tics        -0.15
    prit        -0.15
    IFORM       -0.14
    alm         -0.14
    zsche       -0.14
    cret        -0.14
    ahoma       -0.14
    aster       -0.13
    POSITIVE LOGITS
    /vendors          0.16
     Wax              0.15
     wax              0.15
     Pere             0.15
    Ļ                 0.14
    éĢļ               0.14
    åĥ                0.14
    -transition       0.14
     Gro              0.13
    AdapterFactory    0.13
    Activation Density: 0.002%

    No Known Activations