INDEX
Explanations
phrases relating to considering, evaluating, or giving opinions on various topics
Negative Logits
ICLE        -0.68
eer         -0.68
Gutenberg   -0.68
nered       -0.66
Broken      -0.63
BLE         -0.61
Brist       -0.61
chief       -0.60
Kart        -0.60
etheless    -0.60
Positive Logits
weigh       1.10
weighing    0.89
weights     0.84
ighed       0.84
heaviest    0.82
weighed     0.78
heavily     0.77
down        0.75
heavier     0.74
enance      0.73
Activations Density 0.044%