INDEX
    Explanations

    proper nouns, particularly names and titles

    Negative Logits
    Token         Logit
    "avin"        -0.18
    "orsch"       -0.15
    "acht"        -0.15
    "obao"        -0.15
    "roker"       -0.15
    "anke"        -0.15
    " handlers"   -0.15
    " Rig"        -0.14
    "eldon"       -0.14
    "likes"       -0.14
    Positive Logits
    Token         Logit
    "umhur"       0.25
    "ICA"         0.19
    "ibr"         0.18
    "elic"        0.17
    "edd"         0.17
    "wyn"         0.17
    "IBUT"        0.17
    "udo"         0.16
    "ISR"         0.16
    "afari"       0.15
    Activation Density: 0.021%

    No Known Activations