INDEX
    Explanations

    titles and academic texts

    Negative Logits
    "joner"           0.47
    "prising"         0.43
    "igated"          0.43
    "ံ့"              0.43
    "<unused2169>"    0.42
    " bölg"           0.41
    "respective"      0.40
    "ərə"             0.40
    " admirers"       0.40
    "ލ"               0.40
    Positive Logits
    " "               0.46
    " puberty"        0.44
    " adolescent"     0.38
    " B"              0.36
    " tuo"            0.35
    " cosmology"      0.35
    "ότητας"          0.34
    " py"             0.34
    " TOEFL"          0.34
    " Piano"          0.34
    Activation Density 0.000%

    No Known Activations