    Negative Logits
    Token            Logit
    " Frederick"     -0.08
    " Sharia"        -0.07
    " nir"           -0.07
    "jes"            -0.07
    "της"            -0.07
    (whitespace)     -0.07
    " Страна"        -0.07
    "ูม"             -0.06
    " [...]"         -0.06
    "WindowText"     -0.06
    Positive Logits
    Token            Logit
    (not shown)      0.07
    " Mighty"        0.07
    " In"            0.06
    " in"            0.06
    (not shown)      0.06
    " commits"       0.06
    " Daly"          0.06
    " BEGIN"         0.06
    " IN"            0.06
    " Nel"           0.06
    Act Density 0.015%

    No Known Activations