INDEX
Explanations
phrases that compare or contrast similarities and differences
phrases that emphasize the notion of political context or significance
Negative Logits
shalt          -0.74
interrupted    -0.73
quished        -0.70
hett           -0.68
IAN            -0.67
Collider       -0.67
lost           -0.64
aunder         -0.63
itten          -0.62
istani         -0.62
Positive Logits
nature         1.34
origin         1.10
disguise       1.10
appearance     1.07
tone           1.03
principle      1.02
scope          1.01
comparison     0.97
terms          0.97
stature        0.97
Activations Density 0.143%