INDEX
    Explanations

    instances of the word "I" and expressions of personal opinions or experiences

    Negative Logits
    Token                 Logit
    "Ŀ"                   -0.17
    "opic"                -0.16
    "338"                 -0.15
    "-anchor"             -0.14
    " Gy"                 -0.14
    "ãĥ«ãĥī"              -0.14
    " discrimination"     -0.14
    " Rodney"             -0.14
    " gy"                 -0.14
    "Mac"                 -0.14
    Positive Logits
    Token                 Logit
    " PLL"                0.33
    "PLL"                 0.29
    " Tro"                0.26
    "pll"                 0.25
    " Pretty"             0.25
    "Tro"                 0.24
    " pll"                0.24
    " tro"                0.23
    "Pretty"              0.22
    " Hanna"              0.21
    Act Density 0.002%

    No Known Activations