INDEX
Explanations
references to cartoon characters
mentions of cartoons
Negative Logits
opio        -0.70
acia        -0.66
Sov         -0.66
olate       -0.64
pta         -0.64
Administ    -0.63
forces      -0.62
iott        -0.61
20439       -0.61
CI          -0.61
Positive Logits
ishly       1.22
ists        1.04
ist         0.99
cartoons    0.97
ish         0.96
frog        0.96
oons        0.95
caric       0.93
cartoon     0.91
depictions  0.90
Activations Density: 0.046%