    Explanations

    phrases suggesting doubts or questioning perceptions

    appearance or perception not matching reality

    appearance versus reality

    Negative Logits
    AndEndTag          -0.46
    ConstraintMaker    -0.45
    الحياه             -0.44
    AnchorStyles       -0.43
     preventative      -0.42
     defaultstate      -0.42
    المناصب            -0.41
    erapeutics         -0.41
    ]")]               -0.41
    SequentialGroup    -0.41
    Positive Logits
     decep             0.47
    以为               0.45
    以為               0.45
    EDEFAULT           0.45
     superfic          0.42
     misconception     0.40
     facade            0.39
     misconceptions    0.39
     deceiving         0.39
    看似               0.38
    Activation Density: 0.380%

    No Known Activations