    Negative Logits
    Token             Logit
    " are"            1.45
    " to"             1.30
    "۹"               1.08
    " rivets"         1.01
    "وای"             1.00
    " welder"         0.99
    " maroc"          0.98
    " spectrogram"    0.98
    " repellent"      0.97
    (not recovered)   0.95
    Positive Logits
    Token             Logit
    "ى"               1.33
    "is"              1.24
    "("               1.21
    "م"               1.17
    "n"               1.13
    "ти"              1.12
    "ن"               1.09
    (not recovered)   1.08
    (not recovered)   1.08
    (not recovered)   1.06
    Activation Density: 0.025%

    No Known Activations