INDEX
    Explanations

    a specific pattern in Hebrew characters

    non-standard characters or symbols

    Negative Logits
     mutants          -0.84
     blacklist        -0.80
     Borderlands      -0.79
     factions         -0.78
     Coul             -0.71
     orbiting         -0.71
     overlapping      -0.71
     impuls           -0.71
     demos            -0.70
     unexpectedly     -0.69
    Positive Logits
    à¤      2.68
    à¥      2.59
    ा       2.54
     à¤     2.17
    à¨      1.65
    ر       1.58
    à©      1.55
    à¦      1.47
    س       1.41
    ×Ļ×     1.40
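Several of the positive-logit tokens above (à¤, à¥, à¨, ×Ļ×) look like encoding debris but may be partial UTF-8 byte sequences shown through a byte-to-character table. The sketch below assumes the model uses a GPT-2-style byte-level BPE vocabulary, which this page does not state; the helper name token_to_bytes is hypothetical. Under that assumption it recovers the raw bytes behind a displayed token, which makes it possible to see which script the token begins.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2 byte-level BPE.

    Assumption: the dashboard's tokenizer follows this scheme.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> raw byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Hypothetical helper: map a displayed token back to its raw bytes."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


if __name__ == "__main__":
    for tok in ["à¤", "à¥", "à¦", "à¨", "à©", "×Ļ×"]:
        print(repr(tok), token_to_bytes(tok).hex(" "))
```

Under this assumption, à¤ corresponds to the bytes e0 a4, the first two bytes of many Devanagari code points (appending be gives ा), and ×Ļ× corresponds to d7 99 d7, that is the Hebrew letter yod followed by the lead byte of another Hebrew character. Tokens shown with a leading space, or already rendered as complete characters (ा, ر, س), are not in the byte-level display form and are outside what this helper handles.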
    Act Density 0.009%

    No Known Activations