INDEX
    Explanations

    mentions of locations, specifically cities or educational institutions

    Negative Logits
    Token           Logit
    "Ñħов"          -0.16
    "pNet"          -0.15
    "coma"          -0.14
    ".mvp"          -0.14
    "tae"           -0.13
    "è¾"            -0.13
    ".multipart"    -0.13
    "andler"        -0.13
    "itchen"        -0.13
    "اÙĦÙĥ"          -0.13
    Positive Logits
    Token           Logit
    " Rouge"        0.27
    " rouge"        0.19
    "leurs"         0.18
    " clin"         0.16
    " Rogue"        0.15
    " Bolt"         0.15
    "rou"           0.15
    "leur"          0.15
    "clin"          0.15
    "ummer"         0.15
    Activation Density: 0.007%

    No Known Activations