INDEX
    Explanations

    intelligence

    Negative Logits
    Token                 Logit
    "Intellectual"        -1.27
    "intelligent"         -1.20
    " intelligent"        -1.19
    "intellectual"        -1.18
    "intelligence"        -1.13
    " intellectual"       -1.12
    " intellectually"     -1.12
    " intelligence"       -1.09
    " Intellectual"       -1.08
    " intelligente"       -1.07
    Positive Logits
    Token                 Logit
    "ity"                 0.96
    "ness"                0.70
    (not shown)           0.64
    "ized"                0.60
    "ism"                 0.60
    "ization"             0.57
    "ly"                  0.56
    "ITY"                 0.56
    "ry"                  0.54
    "ising"               0.53
    Activation Density: 0.300%

    No Known Activations