INDEX
    Explanations
    Negative Logits
    Stopped         0.44
    Reve            0.44
    หยุด            0.43
     stop           0.43
     veterans       0.42
     stoppage       0.41
    Veteran         0.41
    ´               0.41
     veteran        0.41
     останов        0.41
    Positive Logits
    ittel           0.42
    appliquer       0.41
    ంతి             0.40
    esho            0.39
    ળા              0.39
    orns            0.39
                    0.39
    越し            0.39
    ற்ற             0.38
    odeficiency     0.38
    Activation Density 0.004%

    No Known Activations