INDEX
    Explanations

    references to specific individuals, particularly names starting with the letter 'D'

    Negative Logits
    æ¡IJ
    -0.18
    avis
    -0.17
    ÑĢÑĥг
    -0.16
    iaz
    -0.16
    ouble
    -0.15
     cour
    -0.15
    uty
    -0.15
    £i
    -0.15
    emons
    -0.14
    گار
    -0.14
    Positive Logits
    istrovstvÃŃ
    0.19
    opal
    0.17
    antan
    0.16
     nou
    0.15
    anel
    0.15
    allon
    0.15
    SB
    0.15
    (strict
    0.14
    ktop
    0.14
     blond
    0.14
    Act Density 0.046%

    No Known Activations