INDEX
    Explanations

    terms related to truth and factual information

    references to the concept of "truth."

    Negative Logits
    Token       Logit
    senal       -0.70
    emetery     -0.69
    uled        -0.69
    joining     -0.68
    ategory     -0.67
    rotein      -0.65
    ESH         -0.65
    avy         -0.65
    hyde        -0.65
    igree       -0.64
    Positive Logits
    Token       Logit
    fully       1.15
    fulness     1.06
    telling     0.92
    dig         0.90
    ful         0.87
    lyn         0.85
    \\\\\\\\    0.85
    iness       0.83
     seeker     0.82
    deal        0.78
    Act Density 0.023%

    No Known Activations