INDEX
Explanations
terms related to authority and power dynamics within societal structures
Negative Logits
ResourceManager    -0.16
IRCLE              -0.15
ÑİваннÑı           -0.15
.eu                -0.14
erli               -0.14
498                -0.14
Çİ                 -0.14
umeric             -0.14
ouston             -0.14
.getFont           -0.14
Positive Logits
likewise           0.22
dit                0.21
pedig              0.20
ones               0.20
theirs             0.19
Dit                0.18
similarly          0.17
Likewise           0.16
lage               0.16
hers               0.16
Activations Density: 0.197%