    Explanations

    proper names of people

    mentions of specific individuals, particularly those named Nicholas and Christina

    Negative Logits
    Token        Logit
    "stall"      -1.09
    "marked"     -0.94
    "alling"     -0.91
    "views"      -0.90
    "ebook"      -0.88
    "rior"       -0.85
    "aby"        -0.83
    "gress"      -0.83
    "front"      -0.81
    "ishing"     -0.81

    Positive Logits
    Token          Logit
    "ource"        0.79
    " Hernandez"   0.79
    "aurus"        0.76
    " Briggs"      0.74
    "terday"       0.74
    "hift"         0.73
    " Celest"      0.73
    " Isa"         0.72
    " Gustav"      0.72
    " Bender"      0.72

    Activation Density: 0.035%

    No Known Activations