INDEX
    Explanations
    Negative Logits
    ".man"            -0.08
    " sly"            -0.08
    " evas"           -0.08
    "_man"            -0.08
    " rushing"        -0.08
    " Lucky"          -0.08
    " misch"          -0.07
    " vlastní"        -0.07
                      -0.07
    " Wide"           -0.07
    Positive Logits
    " decay"          0.13
    "Decay"           0.13
    " exponentially"  0.13
    " gradually"      0.12
    "_decay"          0.12
    " taper"          0.12
    " fading"         0.12
    " steadily"       0.11
                      0.11
    " dwind"          0.11
    Activation Density: 0.013%

    No Known Activations