    Explanations

    discussions about political arguments and hypocrisy

    Negative Logits
    ç¥Ŀ         -0.13
    alat        -0.13
    crud        -0.12
    .secret     -0.12
    zeich       -0.12
     clearfix   -0.12
    erule       -0.12
    lesai       -0.12
    ÑĢиÑĩ      -0.12
    883         -0.11
    Positive Logits
     argument   0.75
     arguments  0.75
     Argument   0.65
    argument    0.65
    arguments   0.63
     Arguments  0.63
    Argument    0.61
     arg        0.59
    Arguments   0.57
     argue      0.56
    Activation Density 1.009%

    No Known Activations