    Negative Logits
    Token       Value
                0.96
     jsou       0.94
     são        0.88
     are        0.88
     ovat       0.82
    都是        0.80
     janë       0.78
     sont       0.77
     هستند      0.77
     vannak     0.77
    Positive Logits
    Token       Value
     lies       0.42
     goes       0.40
    有一        0.35
     was        0.34
    няется      0.34
     stands     0.33
     Goes       0.33
     heps       0.32
     sits       0.32
     comes      0.32
    Act Density 0.098%

    No Known Activations