INDEX
    Explanations

    proper nouns related to specific entities or places

    Negative Logits
    pedia         -0.16
    spo           -0.16
    γα            -0.15
    alink         -0.15
     DropIndex    -0.14
    INES          -0.14
     fr           -0.14
    429           -0.14
     Davidson     -0.14
    é³            -0.14
    Positive Logits
    timeofday     0.15
    uez           0.14
    odom          0.14
    ек            0.14
    arton         0.14
    raud          0.13
    Ñĥг           0.13
     подÑģ        0.13
    ourt          0.13
    elyn          0.13
    Activation Density: 0.007%

    No Known Activations