INDEX
    Explanations

    references to specific names, potentially related to people or places

    references to popular figures in entertainment, specifically late-night hosts

    Negative Logits
     pus          -0.70
     cm           -0.69
     loudspe      -0.66
     fingerprint  -0.65
     spe          -0.64
     mer          -0.61
     printing     -0.61
     symbolic     -0.61
     supervised   -0.60
     acceler      -0.60
    Positive Logits
     Fallon       4.24
     Kimmel       1.63
    Fall          1.16
     Finn         1.00
     Newport      0.98
     Grande       0.95
    vati          0.94
     Downing      0.94
     Fiona        0.93
    raltar        0.90
    Activation Density: 0.016%

    No Known Activations