INDEX
    Explanations

    proper nouns and names, particularly related to movies, actors, politicians, and places

    Negative Logits
    Token            Logit
    "rane"           -0.73
    "mble"           -0.61
    " Bowen"         -0.59
    "erella"         -0.59
    " critically"    -0.59
    "aspers"         -0.59
    "Ͻ"              -0.58
    "ollen"          -0.58
    "ãĥ¼ãĥĨ"         -0.57
    " Perspect"      -0.56
    Positive Logits
    Token            Logit
    " whatsoever"    1.35
    " nor"           0.82
    "brainer"        0.76
    "ody"            0.72
    "onsense"        0.71
    "*/("            0.71
    "hawk"           0.66
    " anymore"       0.66
    " dime"          0.65
    " hesitation"    0.64
    Activation Density: 0.093%

    No Known Activations