    Explanations

    phrases and discussions about debunking myths or misinformation

    Negative Logits
    rawn        -0.16
    acho        -0.16
    estre       -0.15
    acons       -0.14
     Dou        -0.14
    ảm          -0.14
    lán         -0.14
    ito         -0.14
    ồn          -0.13
    unn         -0.13
    Positive Logits
     ref         0.39
     refute      0.34
     debunk      0.31
     dispro      0.31
     bust        0.29
     dispute     0.28
     disp        0.28
     challenge   0.28
     reb         0.26
     Disp        0.26
    Activation Density: 0.344%

    No Known Activations