    Negative Logits
    Token             Logit
    " Hussein"        -0.84
    " historische"    -0.79
    " takich"         -0.79
    "kyo"             -0.77
    "他们"            -0.77
    " gyere"          -0.76
    "______"          -0.76
    "____"            -0.73
    "htub"            -0.73
    "циали"           -0.73
    Positive Logits
    Token             Logit
    " these"           1.02
                       0.87
    "ابه"              0.85
                       0.84
    "limsy"            0.84
    "These"            0.83
                       0.83
                       0.82
    " placing"         0.80
    " Njema"           0.80
    Activation Density 0.007%

    No Known Activations