    Negative Logits
    "m"          0.47
    "larda"      0.46
    "abhuto"     0.45
    "IMATE"      0.45
    "ாள்"         0.45
    "дите"       0.43
    "larıyla"    0.43
    "туу"        0.41
    "та"         0.41
    "ാവ"         0.41
    Positive Logits
    " verdadeira"    0.81
    " true"          0.74
    " True"          0.69
    " verdadera"     0.68
    " TRUE"          0.66
    " verdadero"     0.66
    " verdadeiro"    0.64
    "True"           0.61
    " ट्रू"            0.61
    " believers"     0.60
    Activation Density: 0.022%

    No Known Activations