INDEX
    Explanations

    mentions of the word "one" or phrases denoting singularity, particularly in contrast to plural terms

    Negative Logits
    aby         -0.17
    ronic       -0.16
    ätt         -0.15
    ollapsed    -0.15
    andin       -0.14
    iaux        -0.14
    ancia       -0.14
    ocol        -0.14
    rab         -0.14
    alic        -0.14
    Positive Logits
    utherland   0.17
    iani        0.16
    vens        0.15
    urtle       0.15
    veh         0.15
    idl         0.15
    uml         0.14
    //{{        0.14
    endar       0.14
    SCI         0.14
    Activation Density 0.039%

    No Known Activations