Explanations
references to romantic male protagonists
Negative Logits
Marshall      -0.17
arpa          -0.16
ä¿            -0.16
record        -0.15
              -0.15
land          -0.14
Bret          -0.14
              -0.14
asta          -0.14
              -0.14
Positive Logits
ocale         0.17
istrovstvÃŃ   0.16
fic           0.16
íĮ¬           0.15
allon         0.15
çĴĥ           0.14
imson         0.14
.observable   0.14
fic           0.14
.namespace    0.14
Activations Density 0.020%