INDEX
    Explanations

    dominance and submission

    Negative Logits
    " daemon"        1.56
    " glycolysis"    1.52
    " scalp"         1.45
    " bioge"         1.42
    " HeLa"          1.42
    "队伍"           1.41
    "ties"           1.40
    " crusade"       1.38
    " triage"        1.38
    " dizziness"     1.37
    Positive Logits
    "ت"     2.22
    "ў"     2.08
    "не"    2.02
    "ه"     1.97
    "ء"     1.87
    "с"     1.84
    "i"     1.84
    "д"     1.82
    "ut"    1.79
    "бо"    1.77
    Act Density 0.005%

    No Known Activations