INDEX
    Explanations

    references to the name "Justin" or "Bieber."

    Negative Logits
    Token             Logit
    "ktop"            -0.76
    "wark"            -0.74
    " extremes"       -0.70
    " mileage"        -0.69
    "rums"            -0.66
    "exempt"          -0.66
    " confinement"    -0.66
    "ansas"           -0.64
    " womb"           -0.63
    "unda"            -0.63
    Positive Logits
    Token             Logit
    " Bieber"         1.38
    " Timber"         1.18
    " Trudeau"        1.02
    " Vernon"         0.91
    " Upton"          0.90
    " Wong"           0.85
    " Hayward"        0.84
    " Bour"           0.84
    "Justin"          0.83
    "onymous"         0.82
    Activation Density: 0.006%

    No Known Activations