    NEGATIVE LOGITS
    " decentral"    0.96
    "↵↵"            0.95
    "IA"            0.93
    "ER"            0.91
    "and"           0.89
    " dispel"       0.87
    " postulate"    0.87
    " refrain"      0.85
    " l"            0.84
    " insult"       0.84
    POSITIVE LOGITS
    "ا"         1.33
    "ס"         1.31
    "ره"        1.02
    "et"        0.96
    "بول"       0.95
    (blank)     0.95
    "s"         0.91
    "ן"         0.90
    "سل"        0.88
    "ті"        0.88
    Activation Density: 0.025%

    No Known Activations