    Explanations

    statements of belief, opinion, or emotional expression

    Negative Logits
    Token           Logit
    "ezier"         -0.16
    "jang"          -0.16
    ".intellij"     -0.15
    "女åŃIJ"        -0.14
    " transf"       -0.14
    "ç¼"            -0.14
    "ÑĢÑıд"        -0.14
    " ange"         -0.14
    ".libs"         -0.14
    ".reducer"      -0.14
    Positive Logits
    Token           Logit
    " personally"   0.27
    " personal"     0.17
    "åĢij"          0.16
    " Daly"         0.16
    "/cop"          0.15
    " лиÑĩ"        0.15
    " himself"      0.15
    "personal"      0.15
    " Personally"   0.15
    " Hed"          0.15
    Activation Density 0.126%

    No Known Activations