INDEX
    Explanations

    references to claims or allegations regarding deception or misinformation

    Head Attr Weights
    0:0.16
    1:0.03
    2:0.09
    3:0.04
    4:0.04
    5:0.05
    6:0.15
    7:0.04
    8:0.06
    9:0.23
    10:0.03
    11:0.03
    Negative Logits
    "acco"       -3.53
    " Hik"       -3.31
    "zech"       -3.29
    " crim"      -3.24
    " coh"       -3.23
    " HK"        -3.22
    " Ic"        -3.18
    " Cooper"    -3.17
    "ongh"       -3.17
    "iolet"      -3.17
    Positive Logits
    " Rend"         8.89
    "Render"        8.22
    " Render"       8.00
    "render"        7.47
    " render"       7.36
    " rendering"    7.25
    "rendered"      6.53
    " rend"         6.38
    " rendered"     6.10
    " renders"      6.07
    Act Density 0.001%

    No Known Activations