    Explanations

    instances of deceitful or false representations of events or actions

    Negative Logits
    Token                  Logit
    " Wiktionnaire"        -0.59
    " disambiguazione"     -0.56
    "تفصیلات"              -0.54
    "Ikus"                 -0.52
    "+#+"                  -0.51
    " debout"              -0.51
    " Wilber"              -0.50
    "igkeit"               -0.50
    "ToFit"                -0.49
    "::$_"                 -0.49
    Positive Logits
    Token                  Logit
    " falsely"             1.54
    " pretending"          1.52
    " pretended"           1.50
    " pretend"             1.47
    " fake"                1.46
    " pretends"            1.43
    " false"               1.31
    " Fake"                1.28
    " illusion"            1.25
    "fake"                 1.25
    Activation Density: 0.624%

    No Known Activations