INDEX
    Explanations

    mentions of names followed by numbers or the abbreviation 'ON'

    instances of the word "on."

    Negative Logits
     cow            -0.61
     Liberties      -0.59
     eg             -0.58
     looking        -0.58
     refuge         -0.58
     revenge        -0.57
     functioning    -0.57
     median         -0.55
     fam            -0.55
     shopping       -0.55
    Positive Logits
    ON      3.94
    ONS     2.72
    ons     1.96
    ONY     1.93
    OND     1.88
    on      1.76
    ONES    1.59
    ONE     1.58
    ONT     1.50
    OFF     1.45
    Activation Density: 0.010%

    No Known Activations