INDEX
    Explanations

    mentions of football and its related terms

    Negative Logits
    ewe
    -0.16
    owitz
    -0.15
    blk
    -0.14
     вос
    -0.14
    uckles
    -0.14
    inery
    -0.14
    Activation
    -0.14
     Yaş
    -0.14
    ighborhood
    -0.13
     Trap
    -0.13
    Positive Logits
    rosse
    0.17
    nat
    0.15
    avier
    0.15
     follower
    0.14
    275
    0.14
    غر
    0.14
    据
    0.14
    isco
    0.14
    ARSE
    0.14
    ัต
    0.14
    Activation Density: 0.016%

    No Known Activations