    Explanations

    concepts related to authenticity and reality in discussions or narratives

    Negative Logits
    Token          Logit
    "éĻIJ"         -0.17
    "pur"          -0.15
    "ä¸ĢåĪĩ"      -0.15
    "eling"        -0.15
    "ester"        -0.15
    "owns"         -0.14
    "esModule"     -0.14
    ".TestCase"    -0.14
    "allon"        -0.14
    "İ"            -0.13
    Positive Logits
    Token          Logit
    " real"        0.20
    " truly"       0.17
    "-real"        0.16
    " Proper"      0.16
    "auc"          0.16
    "(real"        0.16
    "ylan"         0.15
    "yan"          0.15
    "OMB"          0.15
    " Truly"       0.15
    Activation Density: 0.153%

    No Known Activations