INDEX
Explanations
statements regarding societal issues, particularly around concepts of consent and inequality
Negative Logits
ogle        -0.16
any         -0.16
maybe       -0.15
igel        -0.15
sometimes   -0.15
kJ          -0.14
capsule     -0.14
alone       -0.14
ikki        -0.14
allen       -0.13
Positive Logits
celik       0.16
ami         0.15
ìĿ´ìĹIJ      0.15
CCI         0.14
PERT        0.14
LLLL        0.14
/unit       0.14
Mare        0.14
çļ          0.14
ayed        0.14
Activations Density 0.132%