INDEX
    Explanations

    references to specific individuals and their associations

    Negative Logits
        ff    -0.68
        ns    -0.68
        cc    -0.66
        dd    -0.66
        f     -0.64
        rs    -0.64
        ks    -0.64
        lt    -0.63
        gs    -0.63
        nt    -0.62
    Positive Logits
        démocr        0.86
        مشين          0.84
        vastaan       0.84
        énergé        0.82
        supérieurs    0.81
        colorés       0.81
        réguli        0.80
        palvel        0.79
        normaux       0.78
        innamor       0.76
    Activation Density: 0.374%

    No Known Activations