INDEX
    Explanations

    phrases related to distinguishing truth from misinformation

    Negative Logits
    Token        Logit
    "/trunk"     -0.19
    " retro"     -0.14
    "rych"       -0.14
    "tech"       -0.14
    "estone"     -0.14
    ".basic"     -0.14
    " lat"       -0.14
    ".modelo"    -0.14
    "aise"       -0.13
    " Retro"     -0.13
    Positive Logits
    Token        Logit
    "ukan"       0.16
    " baise"     0.16
    " Lies"      0.16
    "_GU"        0.16
    "wind"       0.15
    " íķµ"       0.15
    ".Focus"     0.14
    " noise"     0.14
    "noise"      0.14
    "oref"       0.14
    Act Density 0.163%

    No Known Activations