    Explanations

    references to academic journals and scholarly publications

    Negative Logits
    Token       Logit
    "akk"       -0.17
    "isor"      -0.15
    "utt"       -0.15
    "aged"      -0.14
    "ani"       -0.14
    "age"       -0.14
    "ager"      -0.14
    " iddi"     -0.13
    "æī±"       -0.13
    "itage"     -0.13
    Positive Logits
    Token       Logit
    " Journal"  0.34
    "Journal"   0.29
    " journal"  0.21
    "ournal"    0.19
    " Cah"      0.18
    "boundary"  0.17
    " Forum"    0.17
    " Riv"      0.17
    " Ze"       0.16
    " Signs"    0.16
    Activation Density: 0.040%

    No Known Activations