INDEX
    Explanations

    mentions of England and other UK nations

    Negative Logits
    Token            Logit
    "omba"           -0.07
    "arat"           -0.07
    "istrovství"     -0.07
    "allas"          -0.07
    "pez"            -0.07
    "ancial"         -0.06
    "кут"            -0.06
    "illac"          -0.06
    " ado"           -0.06
    "シア"           -0.06
    Positive Logits
    Token            Logit
    "esses"          0.07
    "ought"          0.07
    "icap"           0.06
    " Priority"      0.06
    " comm"          0.06
    "forcing"        0.06
    "acet"           0.06
    "ĵ"              0.06
    "arg"            0.05
    " mild"          0.05
    Activation Density: 0.001%

    No Known Activations