INDEX
    Negative Logits
     کوئی
    0.45
    oucí
    0.43
     Dekker
    0.40
     هیڅ
    0.40
    Política
    0.40
     Moran
    0.39
     Picker
    0.39
    λαδή
    0.39
     pięk
    0.39
     तित
    0.38
    Positive Logits
    rees
    0.41
    tree
    0.40
    soph
    0.40
    0.40
     seem
    0.38
    frees
    0.36
     በት
    0.36
    0.35
     obey
    0.35
    `
    0.35
    Activation Density: 0.001%

    No Known Activations