INDEX
Explanations
various forms of critique and commentary on societal issues
Negative Logits
ijd               -0.16
Beaver            -0.15
res               -0.15
ikit              -0.14
MetroFramework    -0.14
pline             -0.14
ngle              -0.14
jedn              -0.14
yn                -0.13
ixon              -0.13
Positive Logits
aram              0.19
arella            0.17
idata             0.14
AIT               0.14
agan              0.14
mür               0.14
Pg                0.14
Chu               0.14
esan              0.14
åı·               0.14
Activations Density: 0.254%