    Explanations

    references to sports teams and their interaction with fans

    Negative Logits
    ”?          -0.15
    tero        -0.15
                -0.13
    ”—          -0.13
    роп         -0.13
    очно        -0.13
     поки       -0.13
                -0.13
    [](         -0.13
                -0.13
    Positive Logits
     basically  0.28
     I          0.25
     but        0.25
     really     0.25
     -          0.24
     because    0.24
     obviously  0.24
     whereas    0.23
     yeah       0.23
    .↵          0.22
    Act Density 0.093%

    No Known Activations