INDEX
    Explanations

    sex differences automatically translate

    Negative Logits
    "Probably"       0.42
    " VERY"          0.41
    "$(\"            0.40
    "そらく"          0.38
    " very"          0.38
    "Confidence"     0.38
    " slightly"      0.37
    " সম্ভবত"         0.37
    " Confidence"    0.37
    "明らかに"        0.37
    Positive Logits
    " automaticamente"    1.02
    " автоматически"      1.02
    " necesariamente"     1.01
    " automatically"      0.99
    " automáticamente"    0.97
    " necessarily"        0.95
    " necessariamente"    0.93
    "automatically"       0.93
    " automatiquement"    0.91
    " somehow"            0.90
    Activation Density: 0.075%

    No Known Activations