INDEX
    Explanations

    instances of deception and hidden truths in human behavior and societal norms

    Negative Logits
    quat        -0.16
    [now        -0.15
    /REC        -0.14
    ToPoint     -0.14
    "class      -0.14
    locator     -0.13
    izzas       -0.13
     å»         -0.13
    å¡ļ         -0.13
    //{{        -0.13
    Positive Logits
     underlying   0.56
     underneath   0.49
     behind       0.46
     beneath      0.42
     hidden       0.39
     Behind       0.36
    Behind        0.34
     Bene         0.31
    -hidden       0.31
    hidden        0.30
    Act Density 0.232%

    No Known Activations