INDEX
    Explanations

    phrases indicating absence or negation

    Negative Logits
    nieuw             -0.39
    '))               -0.36
     encontramos      -0.36
    iem               -0.34
     dulu             -0.33
     egyszerű         -0.32
    logisch           -0.32
     heutigen         -0.31
     sederhana        -0.31
    Välislingid       -0.31

    Positive Logits
     mut              0.85
     ModelRenderer    0.83
     ujednoznacz      0.74
     صوتيه            0.73
    mut               0.73
    <?                0.73
     OMITBAD          0.69
     vessel           0.68
    /**               0.67
     without          0.65

    Activation Density 0.249%

    No Known Activations