INDEX
    Negative Logits
        Token              Logit
        " thrust"          -0.08
        " intimately"      -0.08
        (token missing)    -0.08
        " скла"            -0.07
        "Ful"              -0.07
        " compel"          -0.07
        "oreo"             -0.07
        " fuels"           -0.07
        " thym"            -0.07
        (token missing)    -0.07
    Positive Logits
        Token              Logit
        " recap"           0.07
        "wald"             0.07
        " Sus"             0.07
        " Spitz"           0.07
        " Cox"             0.07
        (token missing)    0.07
        " Gavin"           0.07
        " servants"        0.07
        "ALO"              0.07
        " Didier"          0.07
    Activation Density: 0.004%

    No Known Activations