INDEX
    Explanations

    mentions of the word "New" and related contexts

    Negative Logits
    "ulary"        -0.17
    "wert"         -0.15
    "pler"         -0.14
    " nouvelle"    -0.14
    "oji"          -0.14
    "ão"           -0.14
    " mỼi"         -0.13
    "iki"          -0.13
    "anje"         -0.13
    "NEW"          -0.13

    Positive Logits
    " study"        0.23
    "sp"            0.22
    "est"           0.21
    " report"       0.21
    "ark"           0.21
    " Yorkers"      0.20
    "study"         0.20
    " figures"      0.19
    " York"         0.19
    " Study"        0.19

    Activation Density: 0.044%

    No Known Activations