    Explanations

    proper nouns and phrases related to head-to-head comparisons

    references to television shows and their elements

    Negative Logits
    Token      Logit
    " Riv"     -0.82
    "PLA"      -0.80
    " FANT"    -0.77
    "vette"    -0.77
    "iev"      -0.72
    " Prot"    -0.69
    "Els"      -0.68
    "Dex"      -0.67
    "VG"       -0.66
    " Et"      -0.65
    Positive Logits
    Token      Logit
    " Head"    2.08
    "Head"     2.04
    " Heads"   2.04
    " head"    1.97
    "head"     1.93
    " HEAD"    1.84
    " heads"   1.84
    "heads"    1.73
    "HEAD"     1.72
    " tails"   1.60
    Activation Density: 0.125%

    No Known Activations