Explanations
references to irony
instances of the word "irony" and its variations
Negative Logits
eding         -0.78
league        -0.75
á             -0.75
igree         -0.72
erto          -0.72
artney        -0.71
lished        -0.71
士            -0.69
improve       -0.69
Interstitial  -0.67
Positive Logits
irony         1.13
twist         1.08
ironic        1.01
juxtap        0.90
netflix       0.87
Osw           0.86
paradox       0.82
mockery       0.80
wink          0.78
twists        0.76
Activations Density: 0.039%