INDEX
    Explanations

    expressions of awareness and realization about various subjects

    Negative Logits
    oret
    -0.15
    mer
    -0.15
    pire
    -0.14
    ł
    -0.14
    ales
    -0.13
    лом
    -0.13
    witch
    -0.13
    roid
    -0.13
    ana
    -0.13
    zik
    -0.13
    Positive Logits
     rằng
    0.23
     there
    0.23
     that
    0.22
     bahwa
    0.20
     they
    0.18
    that
    0.18
     дека
    0.16
     it
    0.15
    ©
    0.15
    イズ
    0.15
    Act Density 0.279%

    No Known Activations