INDEX
Explanations
concepts related to social theory and ideology
Negative Logits
ức        -0.19
ÑĢовиÑĩ    -0.17
igs       -0.14
ustomer   -0.14
essian    -0.14
gth       -0.14
askell    -0.14
ioni      -0.14
onga      -0.14
Äįem      -0.14
Positive Logits
sin       0.16
sine      0.16
punk      0.15
sin       0.15
unto      0.15
.ud       0.14
enburg    0.14
Ïħμ       0.14
Sight     0.14
Snap      0.14
Activations Density: 0.438%