INDEX
    Explanations

    questions and statements about identity and self-perception

    Negative Logits
     pleaſure    -0.73
     fevere    -0.69
    دانشنامهٔ    -0.67
     fuper    -0.62
     fhort    -0.61
     ―――――    -0.61
     poffible    -0.60
    oredCriteria    -0.60
     againſt    -0.60
     neceff    -0.59

    Positive Logits
    pezi    0.55
    KURZBESCHREIBUNG    0.55
    bodyParser    0.51
    artifactId    0.51
     State    0.50
    +:+    0.49
    gdala    0.48
    تقاوى    0.46
    \{\\    0.46
    lude    0.46

    Act Density 0.218%

    No Known Activations