INDEX
    Negative Logits
    Token         Logit
     railway      -0.07
     google       -0.07
     nav          -0.07
     selector     -0.07
     nodeName     -0.07
     Bert         -0.07
     Lange        -0.06
    =context      -0.06
    de            -0.06
    design        -0.06
    Positive Logits
    Token         Logit
    𝗢             0.07
    قه            0.07
    че            0.07
    פה            0.07
    orda          0.07
    productive    0.07
     mosquitoes   0.07
    nicos         0.06
    acimiento     0.06
     embroidered  0.06
    Activation Density: 0.015%

    No Known Activations