INDEX
    Explanations
    Negative Logits
    Token            Logit
    " backward"      -0.07
    " floating"      -0.07
    " backwards"     -0.06
    "iltro"          -0.06
    (blank)          -0.06
    "ція"            -0.06
    " gameplay"      -0.06
    ".scrollTop"     -0.06
    "vided"          -0.06
    " ju"            -0.06
    Positive Logits
    Token            Logit
    " freeway"       0.07
    " Beetle"        0.06
    " Kürt"          0.06
    "ああ"           0.06
    "Literal"        0.06
    (blank)          0.06
    " UNC"           0.06
    "hone"           0.06
    " muzzle"        0.06
    "jal"            0.06
    Activation Density: 0.002%

    No Known Activations