INDEX
    Explanations
    Negative Logits
    furt        -0.07
    spawn       -0.07
     Da         -0.06
     meisje     -0.06
    uw          -0.06
    Lv          -0.06
     Poland     -0.06
     Owen       -0.06
    жд          -0.06
    ovie        -0.06
    Positive Logits
    .blob       0.07
     bilinen    0.06
     Někter     0.06
     Determin   0.06
    ...)        0.06
     solutions  0.06
    wizard      0.06
                0.06
    Determin    0.06
                0.06
    Act Density 0.002%

    No Known Activations