INDEX
    Explanations

    specific names or entities, likely in a professional or academic context

    Negative Logits
    ...            -0.22
    ...'           -0.16
     -             -0.15
    ...↵           -0.14
    .googleapis    -0.14
    ..."↵          -0.14
    ...↵↵          -0.14
     ...           -0.14
    ...]↵↵         -0.13
    ød             -0.13
    Positive Logits
     Arsenal       0.42
    Ars            0.32
     Fans          0.24
     arsenal       0.24
     Ars           0.22
     AFC           0.22
     Supporters    0.22
     football      0.20
     Football      0.20
     fans          0.20
    Act Density 0.003%

    No Known Activations