    Negative Logits
    "-shop"          -0.07
    " Walker"        -0.07
    " Hardy"         -0.07
    " Ernst"         -0.06
    " calf"          -0.06
    " scramble"      -0.06
    " Reed"          -0.06
    " scrambling"    -0.06
    "Ob"             -0.06
    " day"           -0.06
    Positive Logits
    " potential"     0.12
    "potential"      0.11
    " Potential"     0.10
    "Potential"      0.08
    " Пот"           0.08
    "Pot"            0.08
    " Pot"           0.08
    " potentially"   0.07
    [token missing]  0.07
    "्थन"            0.07
    Activation Density: 0.039%

    No Known Activations