INDEX
Explanations
content that evokes humor or comedic elements
Negative Logits
avanaugh     -0.17
semiclass    -0.15
arine        -0.14
aut          -0.14
ozem         -0.14
held         -0.14
-born        -0.13
hPa          -0.13
Sovere       -0.13
Enlarge      -0.13
Positive Logits
ccb          0.16
ityEngine    0.15
ctors        0.14
reading      0.14
avin         0.14
óng          0.13
LOSE         0.13
rung         0.13
rede         0.13
ksam         0.13
Activations Density: 0.008%