INDEX
Explanations
terms related to specific attributes or characteristics
terms related to specificity in various contexts
Negative Logits
rican       -0.76
Bush        -0.74
former      -0.72
=-=-=-=-    -0.72
ruary       -0.70
Jenn        -0.69
kj          -0.68
NER         -0.68
http        -0.68
TPP         -0.68
Positive Logits
ities       1.14
ally        1.04
iveness     0.97
izations    0.89
ivity       0.85
ality       0.85
iations     0.84
itarian     0.83
ileged      0.83
ALLY        0.79
Activations Density 0.020%