INDEX
Explanations
words related to actions or events
instances of the word "took."
Negative Logits
eers          -0.71
david         -0.66
egg           -0.65
ibel          -0.61
ledge         -0.60
---------     -0.59
bombardment   -0.59
Smile         -0.58
Trop          -0.58
collapsing    -0.58
Positive Logits
FINE          0.92
Mehran        0.88
arnaev        0.87
osate         0.85
aways         0.84
inka          0.81
raltar        0.81
autions       0.79
イト          0.79
staking       0.79
Activations Density 0.034%