    Explanations

    words related to reflection and self-assessment

    Negative Logits
    Token          Logit
    " mouths"      -0.42
    " mouth"       -0.41
    " Nap"         -0.39
    " WARS"        -0.39
    "putString"    -0.36
    " beef"        -0.36
    " Wars"        -0.35
    "optString"    -0.35
    "StringTo"     -0.33
    "wikia"        -0.33
    Positive Logits
    Token             Logit
    " reflection"     2.20
    " Reflection"     2.00
    "reflection"      1.94
    " reflections"    1.91
    " reflected"      1.84
    " reflect"        1.82
    "Reflection"      1.79
    " reflecting"     1.74
    "reflect"         1.71
    " Reflect"        1.70
    Activation Density: 0.052%

    No Known Activations