INDEX
    Explanations

    references to historical dates and temporal contexts

    Negative Logits
    " tevens"      -0.56
    " său"         -0.53
    " nemlig"      -0.49
    " deoarece"    -0.48
    " např"        -0.48
    "幸いです"      -0.48
    " blijkt"      -0.48
    " znacznie"    -0.47
    " dezelve"     -0.47
    " aldus"       -0.46
    Positive Logits
    " somebody"    0.61
    " gonna"       0.60
    "somebody"     0.60
    "Somebody"     0.57
    " everybody"   0.57
    " really"      0.56
    "really"       0.54
    " Somebody"    0.52
    " fucking"     0.52
    " guys"        0.52
    Activation Density: 0.876%

    No Known Activations