INDEX
    Explanations

    specific words related to academic research, such as "thesis" and "dissertation."

    instances of the words "thesis" and "dissertation."

    NEGATIVE LOGITS
    Token        Logit
    "icz"        -0.74
    "obby"       -0.72
    "nels"       -0.72
    "gm"         -0.71
    "theless"    -0.71
    "atility"    -0.69
    "tering"     -0.66
    "bies"       -0.65
    "tn"         -0.65
    "outube"     -0.63

    POSITIVE LOGITS
    Token              Logit
    "ertation"         1.28
    " thesis"          1.16
    "uates"            0.98
    " dissertation"    0.90
    "ually"            0.87
    "pai"              0.83
    "iary"             0.79
    "doc"              0.76
    "endish"           0.73
    " doctoral"        0.70

    Activation Density: 0.010%

    No Known Activations