INDEX
Explanations
puns or wordplay in text
instances of the word "pun" or variations thereof
Negative Logits
;;;;;;;;;;;;    -0.71
Empires         -0.68
Covenant        -0.68
Consent         -0.67
Ae              -0.65
Close           -0.65
Transparency    -0.64
Breach          -0.63
Commons         -0.61
itutional       -0.61
Positive Logits
isher           1.34
ishers          1.30
ting            1.13
ters            1.13
ter             1.12
cheon           1.09
cher            1.07
ishment         1.03
tering          1.00
pun             0.96
Activations Density: 0.035%