INDEX
    Explanations

    proper nouns, particularly names of individuals

    Negative Logits
    "Wunused"     -0.15
    "   "         -0.15
    "pto"         -0.14
    "urrent"      -0.14
    "OMIC"        -0.14
    " Kerry"      -0.14
    " Trouble"    -0.14
    " hoops"      -0.13
    "pard"        -0.13
    "entina"      -0.13
    Positive Logits
    " Phil"       0.24
    "phia"        0.21
    "Phil"        0.19
    " phil"       0.18
    " Philip"     0.18
    " Phill"      0.16
    "_ph"         0.15
    "ãĥĢãĥ¼"      0.15
    "bourg"       0.15
    " Ph"         0.15
    Activation Density: 0.024%

    No Known Activations