Explanations
the concept of redistribution or movement away from a central point or authority
Negative Logits
avic            -0.17
LoadIdentity    -0.16
ovah            -0.16
lady            -0.15
naires          -0.15
asher           -0.15
acy             -0.15
lại             -0.15
_tac            -0.15
र               -0.15
Positive Logits
ward            0.30
away            0.21
oll             0.20
Away            0.19
wards           0.19
-away           0.18
yyyy            0.17
aways           0.17
away            0.16
aland           0.16
Activations Density 0.032%