INDEX
    Explanations

    the word "conf" or its variations at the end of words

    references to confidence and its derivatives

    Negative Logits
    senal        -0.94
    hyde         -0.89
    DAY          -0.83
    ï¸ı          -0.75
    Ô            -0.74
    BALL         -0.73
    NING         -0.71
    matically    -0.69
    OHN          -0.67
    hao          -0.66

    Positive Logits
    essional     1.28
    ederation    1.21
    luence       1.16
    eder         1.10
    usions       1.09
    essor        1.05
    idences      1.00
    eree         0.94
    erences      0.93
    etti         0.92
    Activation Density: 0.006%

    No Known Activations