INDEX
    Explanations
    NEGATIVE LOGITS
    .Lib  -0.07
    _CM  -0.07
    Pla  -0.07
     establishment  -0.07
    .sw  -0.07
    ilename  -0.07
    ))/  -0.07
    świ  -0.07
     cible  -0.07
     }}"  -0.06
    POSITIVE LOGITS
     entanto  0.11
    ажно  0.10
     infatti  0.09
     aside  0.09
     demikian  0.09
     meanwhile  0.09
     afikun  0.09
    торая  0.08
    情况下  0.08
     λοιπόν  0.08
    Activation Density 0.135%

    No Known Activations