    Negative Logits
    "kr"       0.38
    "西"       0.37
    "ussia"    0.34
    "ISP"      0.34
    "cn"       0.33
    (token not rendered)  0.32
    "hen"      0.31
    "isp"      0.31
    "zinc"     0.30
    "KP"       0.30
    Positive Logits
    " BE"      0.45
    " NL"      0.39
    "NL"       0.38
    " nl"      0.36
    "BE"       0.34
    "<0xC2>"   0.31
    " nn"      0.30
    " FR"      0.30
    " IT"      0.30
    " BS"      0.29
    Activation Density: 0.001%

    No Known Activations