    Explanations

    words related to certainty or confidence

    Negative Logits
    Token          Logit
    "cial"         -0.76
    "idelines"     -0.74
    "utch"         -0.74
    "jab"          -0.72
    "vert"         -0.69
    "AK"           -0.67
    "enta"         -0.67
    "hes"          -0.66
    "thinkable"    -0.66
    "Rated"        -0.66

    Positive Logits
    Token            Logit
    " someday"       0.85
    " whoever"       0.83
    " Rasmussen"     0.74
    " sooner"        0.71
    " Admin"         0.70
    " readers"       0.67
    " none"          0.64
    " someone"       0.62
    " historians"    0.62
    " CCP"           0.61

    Activation Density: 0.313%

    No Known Activations