INDEX
    Explanations
    NEGATIVE LOGITS
    "л"   1.39
    "с"   1.25
    "м"   1.16
    "ين"  1.05
    "ш"   1.05
    "س"   1.01
    "ви"  0.96
    "х"   0.96
    "з"   0.95
    " в"  0.91
    POSITIVE LOGITS
    "Fire"             1.09
    " fire"            1.05
    [token not shown]  0.97
    " Fire"            0.94
    " a"               0.82
    "P"                0.82
    "Session"          0.81
    "א"                0.78
    "H"                0.77
    "c"                0.76
    Activation Density: 0.010%

    No Known Activations