Explanations
verbs related to forceful actions or destruction
actions associated with physical forces and their effects
Negative Logits
ende               -0.72
uben               -0.68
覚醒               -0.67
ffe                -0.67
obbies             -0.67
****************   -0.63
TPP                -0.63
enhagen            -0.62
morph              -0.60
Legal              -0.60
Positive Logits
away               1.23
apart              1.02
down               1.00
them               1.00
off                0.95
attackers          0.92
opponents          0.88
out                0.88
downward           0.88
forth              0.87
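
For readers unfamiliar with how token lists like the ones above are usually produced, here is a minimal sketch. It assumes (the page does not say so) that each value is the feature's decoder direction projected through the model's unembedding matrix, with the most negative and most positive tokens kept. The function name, matrix shapes, and toy numbers are all hypothetical and are not this feature's data.

```python
import numpy as np

def top_logit_tokens(decoder_direction, unembed, vocab, k=10):
    """Return the k most negative and k most positive tokens by logit effect.

    Assumption: logit effect = decoder_direction @ unembed, where
    decoder_direction has shape (d_model,) and unembed has shape
    (d_model, vocab_size). This is a common convention, not a confirmed
    description of how this dashboard computes its numbers.
    """
    effects = decoder_direction @ unembed          # shape: (vocab_size,)
    order = np.argsort(effects)                    # ascending by effect
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data only: a 2-dimensional model with a 4-token vocabulary.
vocab = ["a", "b", "c", "d"]
unembed = np.array([[1.0, 0.5, -0.5, -1.0],
                    [0.0, 1.0,  0.0,  0.0]])
direction = np.array([1.0, 0.25])

neg, pos = top_logit_tokens(direction, unembed, vocab, k=2)
print("negative:", neg)
print("positive:", pos)
```

Under this reading, tokens such as "away", "apart", and "down" are the ones the feature pushes the model toward predicting next, which fits the explanations about forceful actions.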
Activations Density 0.230%
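
As a rough illustration of what a density figure like this can mean, the sketch below assumes it is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name, and the toy array are assumptions, not something the page states.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts any strictly positive activation as "active".
    """
    acts = np.asarray(activations, dtype=float)
    return float(np.mean(acts > threshold))

# Toy data only: 1000 tokens, the feature fires on 2 of them.
toy = np.zeros(1000)
toy[[3, 500]] = [1.7, 0.4]

density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.200%
```

On that definition, a density of 0.230% would mean the feature is active on roughly 2 or 3 tokens out of every 1,000 sampled.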