INDEX
    Explanations

    phrases related to action, decision-making, and important concepts

    Negative Logits
     الرياضيه
    -0.80
     виправивши
    -0.77
    expandindo
    -0.74
    __':
    
    -0.61
     bezeichneter
    -0.61
     gainera
    -0.57
    respectively
    -0.55
    zeczytaj
    -0.55
    شكرا
    -0.55
     boulangerie
    -0.53
    Positive Logits
    1.12
    "
    1.03
    0.98
    ’’
    0.94
    "—
    0.91
    ”!
    0.91
    ”—
    0.90
    "!
    0.86
    ”-
    0.85
    ”…
    0.81
    Activation Density 0.441%

    No Known Activations