INDEX
    Explanations

    the presence of specific named entities, particularly names and titles

    Negative Logits
    hof          -0.73
    iage         -0.72
     Lans        -0.70
     Hole        -0.69
     Osw         -0.69
    mares        -0.65
     Chero       -0.65
    engers       -0.64
     reminders   -0.63
     Beg         -0.63
    Positive Logits
    âĸijâĸij       1.25
    女            1.22
    ption        1.06
    éĹ           1.05
    entric       1.04
    çĶŁ          1.02
    LECT         1.00
    å¹           0.97
    âĸij          0.95
    æĪ¦          0.92
    Act Density 0.001%

    No Known Activations