INDEX
Explanations
references to well-known celebrities or public figures
Negative Logits
Token         Logit
ofs           -0.16
akit          -0.14
RVA           -0.14
otton         -0.14
bras          -0.13
BCM           -0.13
cocci         -0.13
elves         -0.13
/Foundation   -0.13
kraje         -0.13
Positive Logits
Token         Logit
Fucking       0.20
mania         0.18
Freak         0.18
Netanyahu     0.17
freak         0.17
’s            0.17
vs            0.17
's            0.17
-esque        0.17
Gandhi        0.16
Activations Density 0.101%
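
The density figure can be read as the share of token positions on which the feature fires. The page does not say how it is computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where the
    feature is active. The source does not confirm this definition.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in values if a > threshold)
    return 100.0 * fired / len(values)


# Hypothetical data: 100,000 token positions, 101 of which activate the feature.
sample = [0.0] * 99_899 + [2.0] * 101
density = activation_density(sample)
print(f"Activations Density {density:.3f}%")
```

Under this reading, a density near 0.1% means the feature is active on roughly one token in a thousand, which would fit a feature tied to a narrow topic such as named public figures.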