INDEX
    Explanations

    I express opinion/sentiment

    NEGATIVE LOGITS
    Value  Token
    0.53   "Remark"
    0.48   "®,"
    0.48   " зокрема"
    0.47   "Notably"
    0.46   "ですが"
    0.46   (token not shown)
    0.46   " ซึ่ง"
    0.45   "(),"
    0.45   (token not shown)
    0.45   "이며"
    POSITIVE LOGITS
    Value  Token
    0.56   " nowe"
    0.54   " didn"
    0.53   " get"
    0.52   " exclaimed"
    0.51   " these"
    0.50   " forgot"
    0.49   " These"
    0.48   " nieuwe"
    0.48   " Jetzt"
    0.47   " scared"
    Activation Density: 0.060%

    No Known Activations