INDEX
Explanations
terms related to societal constructs and their implications
Negative Logits
ileged     -0.20
ة          -0.20
lopedia    -0.19
thing      -0.18
あった     -0.17
ippet      -0.17
thon       -0.17
sburg      -0.17
tures      -0.17
ン         -0.17
Positive Logits
quarters   0.29
bsites     0.28
adays      0.28
etheless   0.27
nger       0.26
tlement    0.25
atre       0.24
west       0.23
ropolitan  0.22
erior      0.21
Activations Density 0.380%