INDEX
    Explanations

    words or phrases related to academic or educational settings

    Negative Logits
    anke     -0.18
    anst     -0.17
    rze      -0.15
    stype    -0.14
    LOPT     -0.14
    ipel     -0.14
    жи       -0.14
    476      -0.14
    akis     -0.14
    abay     -0.13
    Positive Logits
     gad            0.17
    ohl             0.15
    fid             0.14
    robe            0.14
     EQUI           0.14
     restless       0.14
    agon            0.14
    ओ               0.14
    /Instruction    0.14
    hee             0.13
    Act Density 0.016%

    No Known Activations