INDEX
    Explanations

    mentions of reference points and citations in academic or technical texts

    Negative Logits
    erna            -0.18
    ern             -0.18
    ish             -0.15
    de              -0.15
    оз              -0.14
    ông             -0.14
     ÎĶια           -0.14
    ̣               -0.14
    trad            -0.14
     Mell           -0.14
    Positive Logits
    izes            0.20
    /Instruction    0.17
    andum           0.17
     NÄĽm           0.16
    ién             0.16
    coni            0.16
    ourcem          0.16
    /reference      0.15
    resher          0.15
    peating         0.15
    Act Density 0.017%

    No Known Activations