INDEX
    Explanations

    discrepancies between appearance or assertion and reality

    phrases that highlight the contrast between perception and reality

    Negative Logits
    "uyomi"       -0.70
    "inel"        -0.65
    "erity"       -0.63
    "luster"      -0.63
    "rosso"       -0.62
    " EDITION"    -0.62
    "aunted"      -0.62
    "ador"        -0.61
    "tackle"      -0.60
    "iculty"      -0.60
    Positive Logits
    " indeed"      0.90
    " untrue"      0.81
    " actually"    0.79
    " meant"       0.77
    " intended"    0.76
    " true"        0.72
    "actually"     0.71
    REDACTED       0.70
    " existed"     0.68
    " factual"     0.68
    Act Density 0.973%

    No Known Activations