INDEX
    Explanations

    references to personal experiences and emotional responses

    Negative Logits
     we             -0.89
     I              -0.80
     you            -0.77
     he             -0.74
     We             -0.69
     i              -0.67
     You            -0.66
                    -0.65
     O              -0.63
     He             -0.63
    Positive Logits
    abestanden      1.24
    AndEndTag       1.22
    ReusableCell    1.21
    enumii          1.20
    Tikang          1.19
     للاسماء        1.19
     myſelf         1.16
     дописавши      1.16
    TagMode         1.15
     bezeichneter   1.13
    Activation Density: 0.147%

    No Known Activations