    Explanations

    phrases related to deception and manipulation, particularly in the context of politics and personal relationships

    Negative Logits
    Token               Logit
    "DotNetBar"         -0.64
    " newBuilder"       -0.63
    "المناصب"           -0.55
    " Wiktionnaire"     -0.55
    " unification"      -0.55
    " pleaſure"         -0.54
    "RunAsync"          -0.53
    " Universitaria"    -0.52
    " effusion"         -0.52
    " crossorigin"      -0.52
    Positive Logits
    Token               Logit
    " pretend"          0.74
    " pretended"        0.64
    " fake"             0.63
    " pretending"       0.61
    " pretends"         0.59
    " faked"            0.56
    "tagHelper"         0.53
    " giả"              0.52
    (unrendered token)  0.52
    "fake"              0.51
    Activation Density: 0.403%

    No Known Activations