INDEX
    Explanations

    names of people or entities preceded by a title or username

    instances of proper nouns, often related to names or identities

    Negative Logits
    Token              Logit
    " Reynolds"        -0.79
    "avia"             -0.78
    "SK"               -0.75
    "319"              -0.74
    " diesel"          -0.73
    "EV"               -0.73
    "DK"               -0.70
    "OV"               -0.70
    " Interstellar"    -0.70
    " viruses"         -0.69
    Positive Logits
    Token              Logit
    " Bar"             2.64
    "Bar"              2.42
    " bar"             2.30
    "bar"              2.21
    " Bars"            2.05
    " BAR"             1.98
    " bars"            1.97
    "bars"             1.75
    " Barber"          1.64
    " Barb"            1.44
    Activation Density: 0.214%

    No Known Activations