Explanations
themes related to societal control and totalitarianism
Negative Logits
indsight    -0.15
ーデ        -0.14
ore         -0.14
ichi        -0.14
алю         -0.14
perd        -0.14
ometown     -0.13
edom        -0.13
aran        -0.13
ToDevice    -0.13
Positive Logits
hoa         0.17
perfect     0.16
Quar        0.15
quint       0.15
_connector  0.14
abol        0.14
masses      0.14
rise        0.14
گاوی        0.14
Benjamin    0.14
Activations Density 0.315%