INDEX
    Explanations

    proper nouns, specifically names of people

    occurrences of the word "More" and its variations in different contexts

    Negative Logits
    Token (quoted)    Logit
    "IPS"             -0.68
    "oes"             -0.66
    "Cro"             -0.66
    "liest"           -0.64
    "%]"              -0.62
    "oresc"           -0.62
    "ividual"         -0.61
    "OCK"             -0.61
    "IP"              -0.61
    "keeping"         -0.60
    Positive Logits
    Token (quoted)    Logit
    " than"           1.28
    " Than"           0.93
    " importantly"    0.91
    "HUD"             0.84
    " likely"         0.79
    "than"            0.79
    "ened"            0.78
    " closely"        0.78
    " ado"            0.78
    "models"          0.74
    Activation Density: 0.102%

    No Known Activations