    Explanations

    phrases indicating stability and consistency over time

    Negative Logits
    "Personensuche"  -0.59
    " հղումներ"  -0.52
    " Suivez"  -0.47
    "RUnlock"  -0.46
    " sprüche"  -0.46
    "InputBorder"  -0.45
    "aarrggbb"  -0.45
    "الإنجليزية"  -0.45
    " queſta"  -0.43
    " yarar"  -0.42
    Positive Logits
    " unchanged"  0.85
    " unchanging"  0.71
    " identical"  0.61
    "変わらない"  0.57
    "変わらず"  0.56
    " unaltered"  0.54
    "same"  0.54
    "Same"  0.54
    "identical"  0.54
    " same"  0.54
    Activation Density: 1.004%

    No Known Activations