INDEX
    Explanations

    Words broadly related to scientific discourse, often including studies, numbers, and sources

    Expressing viewpoints/emotions

    Negative Logits
     nahilalakip
    -0.80
     незавершена
    -0.76
     iſt
    -0.69
     تانيه
    -0.67
    ագրություններ
    -0.65
     ſche
    -0.64
     shewn
    -0.63
     Wicidata
    -0.61
     صوتيه
    -0.60
     Cæsar
    -0.60
    Positive Logits
     relâche
    0.50
    ,:);
    0.48
    nalités
    0.47
    UserScript
    0.47
    TProtocol
    0.47
     artificiales
    0.46
    argo
    0.46
     đôi
    0.45
    kwds
    0.43
     chắn
    0.42
    Activation Density: 25.417%

    No Known Activations