    Negative Logits
     verbal      -2.05
     verbally    -1.77
     Verbal      -1.70
    verbal       -1.63
    Verbal       -1.52
    #+#          -0.96
    зулта        -0.91
     saites      -0.89
    \{\\         -0.87
     חיצוניים    -0.85
    Positive Logits
    ity          0.71
    ly           0.68
    ized         0.65
    ed           0.63
    ised         0.60
    es           0.55
    ism          0.52
    izing        0.51
    ise          0.51
    ization      0.50
    Activation Density: 0.097%

    No Known Activations