INDEX
    Explanations

    mentions of specific locations or organizations

    phrases indicating location and presence of entities or events

    Negative Logits
    Token              Logit
    "FILE"             -0.65
    " concurrently"    -0.63
    "coon"             -0.63
    "oldown"           -0.63
    " reperto"         -0.62
    " whereas"         -0.62
    " suspic"          -0.61
    "*/("              -0.61
    " favorably"       -0.60
    "quez"             -0.60
    Positive Logits
    Token              Logit
    " ours"            0.94
    " Patreon"         0.80
    " Blog"            0.79
    " HuffPost"        0.74
    " Ao"              0.74
    " Subtle"          0.72
    " EW"              0.72
    " LW"              0.71
    " Pod"             0.71
    " Bearing"         0.71
    Act Density 0.086%
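
The density figure above can be read as the share of token positions on which the feature fires. The sketch below assumes that reading (an activation counts as "firing" when it is strictly greater than zero); the page itself does not state the exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, chosen only so that the result matches the reported 0.086%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with activation > 0.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 100,000 token positions, feature fires on 86 of them.
sample = [0.0] * 100_000
for i in range(86):
    sample[i] = 1.5  # arbitrary positive activation value

density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

A density this low means the feature is silent on the vast majority of tokens, which fits the "No Known Activations" note when the sampled text happens to contain none of the firing positions.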

    No Known Activations