INDEX
    Explanations

    instances of negation, specifically the word "not."

    Negative Logits
    arpa           -0.47
    Hymen          -0.46
     Krieges       -0.45
    oplasma        -0.45
    Mga            -0.44
     MethodInfo    -0.44
     savent        -0.44
    ثيق            -0.43
    arakhand       -0.43
    “……”           -0.43
    Positive Logits
     philanth      0.90
     hairc         0.82
     vectra        0.81
     vhs           0.80
     shenan        0.79
     necessari     0.79
     ktm           0.79
    ikkert         0.78
     toshiba       0.77
     necessarie    0.75
    Activation Density: 0.222%

    No Known Activations