INDEX
    Negative Logits
    يذ  0.43
    acki  0.43
    فس  0.41
     Abend  0.39
    ahanglan  0.38
    Holder  0.38
     Sau  0.37
     koj  0.37
    ADH  0.36
     temu  0.36
    Positive Logits
    teacher  0.92
    teachers  0.91
    🏫  0.89
    yard  0.82
    校长  0.72
    mates  0.71
    girl  0.71
     districts  0.70
     curricula  0.68
    girls  0.67
    Activation Density: 0.012%

    No Known Activations