Explanations
terms indicating relativity or comparative relationships
Negative Logits
ory       -0.19
ORY       -0.17
nat       -0.15
hunter    -0.15
ermann    -0.15
ancial    -0.15
reme      -0.14
Hund      -0.14
Warren    -0.14
Schwarz   -0.14
Positive Logits
Barnett    0.16
857        0.15
relative   0.15
avou       0.15
humidity   0.15
Relative   0.15
gui        0.15
å¾ħ        0.14
fruit      0.14
itudes     0.14
Activations Density 0.028%
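
For context on the density figure above, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is greater than zero; the page itself does not define the term. The function name, the threshold parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 3 of which activate the feature.
toy_activations = [0.0] * 10_000
for position, value in [(17, 0.4), (4_210, 1.3), (9_001, 0.2)]:
    toy_activations[position] = value

density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.028% would correspond to the feature firing on roughly 28 out of every 100,000 token positions.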