INDEX
    Explanations

    proper nouns, specifically names of people or places that start with "Th"

    the presence of the name "Thad" or similar variations

    Negative Logits
     Reloaded       -0.91
    ãģ®éŃĶ          -0.91
    ITED            -0.88
    76561           -0.87
    assetsadobe     -0.83
    keeping         -0.76
    ãĤŃ             -0.76
    ãģĭ             -0.75
    915             -0.74
    Spoiler         -0.73
    Positive Logits
    irteen          1.08
    reshold         1.05
    irst            1.03
    umb             1.03
    orne            1.01
    ought           1.01
    istle           1.00
    umbnail         0.98
    orns            0.98
    orough          0.97
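
One way to read the positive logits against the explanations: each listed token, when prefixed with "Th", forms an English word or name. The sketch below only illustrates that reading. The "Th" prefix is an assumption taken from the explanations, not something the page confirms (no activations are listed), and the variable names are hypothetical.

```python
# Positive-logit tokens copied from the list on this page.
positive_logit_tokens = [
    "irteen", "reshold", "irst", "umb", "orne",
    "ought", "istle", "umbnail", "orns", "orough",
]

# Assumption: the feature fires on a "Th" token, as the explanations suggest.
PREFIX = "Th"

# Join the assumed prefix to each promoted continuation.
completions = [PREFIX + token for token in positive_logit_tokens]

for token, word in zip(positive_logit_tokens, completions):
    print(f"{PREFIX} + {token} = {word}")
```

If this reading holds, the feature promotes continuations of "Th" in general (for example "Threshold" and "Thought"), which is broader than the proper-noun wording of the explanations.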
    Activation Density: 0.015%

    No Known Activations