INDEX
    Explanations

    unexpected qualities or outcomes

    NEGATIVE LOGITS
     religione  0.71
    HERE  0.71
     pergunta  0.68
    Г  0.67
    नए  0.66
     बीमारी  0.66
     milioni  0.66
    Κ  0.66
    Λ  0.66
    nal  0.65
    POSITIVE LOGITS
     disappointing  0.77
     disappointed  0.75
     underwhelming  0.72
     disappointment  0.70
     surprisingly  0.68
     surprised  0.67
    ándo  0.64
     unexpectedly  0.64
     surprising  0.63
     disgusting  0.63
    Act Density 0.121%

    No Known Activations