INDEX
Explanations
references to mental health and consent issues
Negative Logits
akis      -0.21
lug       -0.17
lug       -0.15
Downing   -0.14
lush      -0.14
aka       -0.14
ká        -0.13
esco      -0.13
ans       -0.13
rey       -0.13
Positive Logits
owler     0.15
fold      0.15
TOTYPE    0.15
[of       0.14
Rena      0.14
glob      0.14
aldi      0.14
azÄĥ      0.14
iona      0.14
Semi      0.14
Activations Density: 0.012%