INDEX
    Explanations

    emphasis or excitement in statements

    Negative Logits
    那种
    -0.15
     navíc
    -0.15
     moreover
    -0.14
    .asp
    -0.14
     cả
    -0.14
    >Main
    -0.14
    ataka
    -0.14
     other
    -0.13
    却
    -0.13
    į°
    -0.13
    Positive Logits
    Hi
    0.39
    Hello
    0.39
     Hello
    0.39
     Hi
    0.38
     hello
    0.35
     hi
    0.34
    Greetings
    0.32
    hello
    0.31
    reetings
    0.30
    hi
    0.29
    Activation Density: 0.346%

    No Known Activations