    Explanations

    statements expressing personal opinions or thoughts

    Negative Logits
    emailer
    -0.13
     bulundu
    -0.13
    ُوا
    -0.12
    のは
    -0.12
    ことで
    -0.12
    .pretty
    -0.12
    ellig
    -0.12
    anken
    -0.12
    ール
    -0.12
     пу
    -0.12
    Positive Logits
     there
    1.20
    there
    0.98
     There
    0.90
    There
    0.87
     THERE
    0.85
     هناك
    0.72
     theres
    0.60
     dort
    0.51
    .There
    0.44
    "There
    0.43
    Activation Density 0.906%

    No Known Activations