INDEX
Explanations
references to societal structures and their impact on individuals
Negative Logits
etus        -0.22
ena         -0.17
enus        -0.14
úc          -0.14
stddef      -0.14
pta         -0.14
olk         -0.14
endi        -0.13
memberOf    -0.13
vault       -0.13
Positive Logits
phin        0.17
ινό         0.16
coop        0.16
ataire      0.15
uada        0.14
OBJ         0.14
оряд        0.14
ertas       0.14
opsis       0.14
itize       0.14
Activations Density 0.440%