    NEGATIVE LOGITS
        Token            Logit
        " Trent"         -0.08
        "ici"            -0.08
        " XII"           -0.07
        "143"            -0.07
        " Det"           -0.06
        " الکترون"       -0.06
        "144"            -0.06
        " گست"           -0.06
        " Erie"          -0.06
        " intellect"     -0.06
    POSITIVE LOGITS
        Token            Logit
        " shows"         0.18
        " show"          0.18
        " showed"        0.17
        " shown"         0.15
        " showing"       0.15
        " Show"          0.14
        "shows"          0.13
        "Show"           0.13
        " Showing"       0.12
        " Shows"         0.12
    Activation Density: 0.075%

    No Known Activations