INDEX
Explanations
references to motion or movement
conjunctions, particularly the word "and" in various contexts
Negative Logits
Lank       -0.67
Shiite     -0.63
:(         -0.62
Yemeni     -0.62
ļéĨĴ       -0.60
Cole       -0.60
Beam       -0.59
Raiders    -0.58
Weekly     -0.57
Cortex     -0.57
Positive Logits
rew            0.76
rogens         0.73
congratulate   0.69
inspire        0.69
groom          0.67
yt             0.65
/+             0.64
romeda         0.64
obey           0.63
tremb          0.63
Activations Density 0.230%