INDEX
    Explanations

    terms related to artificiality and deception

    "Fake" or "artificial" preceding nouns

    fake or artificial things

    Negative Logits
    Token              Logit
    "{\"               -0.51
    " pribadi"         -0.49
    "ಂತ"               -0.48
    "usiai"            -0.47
    " înal"            -0.47
    " oamen"           -0.46
    " BoxFit"          -0.46
    "usercontent"      -0.46
    "tamment"          -0.46
    "reactivex"        -0.45
    Positive Logits
    Token              Logit
    "LookAnd"          0.90
    " Fake"            0.80
    " fake"            0.73
    "Fake"             0.71
    " pretending"      0.71
    " sembl"           0.71
    " pretence"        0.70
    " pretends"        0.69
    " Mock"            0.69
    " pretend"         0.69
    Activation Density: 0.225%

    No Known Activations