Explanations
words associated with power dynamics and control in societal and historical contexts
increasingly refusing
Negative Logits
Token          Logit
'\\;'          -0.46
Życiorys       -0.44
carriers       -0.40
KERN           -0.40
Biôgrafia      -0.40
snippetHide    -0.40
🇶              -0.39
étoient        -0.39
tempts         -0.38
Temper         -0.38
Positive Logits
Token            Logit
ScreenState      0.45
Administrativna  0.44
nonUne           0.44
脚注の使い方      0.42
informée         0.42
AddTagHelper     0.41
outWeight        0.38
SharedCtor       0.38
marck            0.38
kháu             0.38
Activations Density: 0.260%