INDEX
    Negative Logits
    " footh"  0.42
    " has"  0.42
    " CG"  0.41
    "Mer"  0.41
    " thigh"  0.41
    " Mer"  0.40
    "我が"  0.40
    " Timeout"  0.38
    " svůj"  0.38
    " Laird"  0.38
    Positive Logits
    "交给"  0.43
    "いです"  0.42
    "是由"  0.41
    " بوده"  0.41
    "이고"  0.40
    "ovatel"  0.40
    " manuss"  0.40
    " kidding"  0.38
    " ہے"  0.38
    " आहेत"  0.38
    Activation Density: 0.005%

    No Known Activations