INDEX
    Explanations
    No Explanations Found
    Negative Logits
     one
    -0.08
    تمر
    -0.07
    -0.07
     setbacks
    -0.07
     Państ
    -0.07
     ובכל
    -0.07
    altet
    -0.07
     stumble
    -0.06
     dictatorship
    -0.06
     One
    -0.06
    Positive Logits
    -inspired
    0.07
    🗝
    0.07
    👆
    0.07
     Meaning
    0.07
     Phon
    0.07
     ClassName
    0.07
     פוס
    0.07
    {↵
    0.07
    ispens
    0.06
     Ethernet
    0.06
    Activation Density: 0.101%

    No Known Activations