INDEX
    Explanations

    occurrences of references to people and their respective identifiers

    Negative Logits
    åıį        -0.18
    WER        -0.16
    hardt      -0.16
    NewProp    -0.15
    ubby       -0.14
    heed       -0.14
     Xxx       -0.14
    ioc        -0.14
    .twig      -0.14
    ullet      -0.14
    Positive Logits
     warm         0.15
     Mes          0.14
     beg          0.14
     Beg          0.14
     div          0.13
     fle          0.13
     directive    0.13
     insp         0.13
     ble          0.13
    ich           0.13
    Activation Density: 0.002%

    No Known Activations