NEGATIVE LOGITS
    çuk  0.48
    领域的  0.45
    ρών  0.44
    גת  0.44
    agascar  0.43
    یان  0.43
    roč  0.42
    <unused94>  0.42
    ěji  0.41
    0.41
    POSITIVE LOGITS
     identical  0.96
     equal  0.79
     gleiche  0.75
     same  0.73
     identique  0.71
     uguale  0.70
     equals  0.69
     identically  0.69
     dezelfde  0.69
    identical  0.69
    Activation Density: 0.056%

    No Known Activations