INDEX
Explanations
the word "take" in various contexts
Negative Logits
Token          Logit
Smile          -0.66
agre           -0.66
Cong           -0.65
holm           -0.64
accompanies    -0.63
ingen          -0.60
ese            -0.60
lich           -0.60
idding         -0.60
dissatisf      -0.59
Positive Logits
Token          Logit
advantage      1.31
aways          1.24
aback          1.01
care           1.00
precedence     0.96
refuge         0.96
heed           0.94
precautions    0.90
pains          0.87
liberties      0.84
Activations Density: 0.444%