INDEX
    Explanations

    the presence of articles or quantifiers in various contexts

    Negative Logits
    Token        Logit
    "ories"      -0.17
    "222"        -0.16
    "lights"     -0.15
    "agal"       -0.15
    "th"         -0.15
    "Bitte"      -0.15
    "illes"      -0.14
    "kır"        -0.14
    "erville"    -0.14
    "ãĥ¼ãĥĬ"     -0.14
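
    The last negative-logit token, ãĥ¼ãĥĬ, looks like a raw byte-level BPE vocabulary string rather than readable text. The sketch below assumes a GPT-2-style byte-to-character table (the page itself does not name the tokenizer, so this is an assumption) and shows how such a string could be mapped back to UTF-8. The helper names are hypothetical.

```python
def byte_to_char_table():
    """Build a GPT-2-style byte -> printable-character table (assumed scheme)."""
    # Bytes that are already printable map to the character with the same code point.
    printable = (
        list(range(0x21, 0x7F))      # '!' .. '~'
        + list(range(0xA1, 0xAD))    # inverted '!' .. not sign
        + list(range(0xAE, 0x100))   # registered sign .. y with diaeresis
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code point 256 + n, in ascending byte order.
    n = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + n)
            n += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(token):
    """Map a byte-level token string back to text; undecodable bytes become U+FFFD."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


decoded = decode_token("ãĥ¼ãĥĬ")
print(decoded)
```

    Under that assumption the token decodes to two katakana characters (U+30FC followed by U+30CA). Tokens that the page already shows in readable form, such as "kır", should not be passed through this helper.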
    Positive Logits
    Token        Logit
    " dozen"     0.27
    " hundred"   0.22
    " thousand"  0.18
    "undred"     0.17
    " decade"    0.16
    "ught"       0.15
    " Vog"       0.14
    " Dek"       0.14
    " century"   0.14
    " doz"       0.14
    Act Density 0.049%
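
    As a rough illustration of what a density figure like this could mean, the sketch below treats activation density as the fraction of sampled tokens on which the feature's activation exceeds a threshold. That definition, the function name, and the sample values are assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 5 of 10,000 tokens.
sample = [0.0] * 9995 + [1.2] * 5
density = activation_density(sample)
print(f"{density:.3%}")
```

    A density this low would indicate a sparse feature that fires on only a small share of tokens.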

    No Known Activations