INDEX
    Explanations

    abstract concepts related to learning and education

    Negative Logits
    adora       -0.18
    ador        -0.17
    APPER       -0.15
    oj          -0.15
    442         -0.14
    408         -0.14
    342         -0.14
    aha         -0.13
    utor        -0.13
     Ank        -0.13
    Positive Logits
    .)↵↵↵↵      0.16
     ones       0.15
     similarly  0.15
    din         0.15
    ôm          0.14
    åŁĭ         0.14
    puties      0.14
     Burgess    0.14
    .pref       0.14
    ÙĩرÙĩ       0.14
    Activation Density 0.370%

    No Known Activations