INDEX
Explanations
texts related to hoaxes and pranks
references to hoaxes and pranks
Negative Logits
oyal       -0.77
foreseen   -0.71
avid       -0.69
HCR        -0.68
arching    -0.68
ailable    -0.68
apsed      -0.67
obligated  -0.66
affer      -0.65
incinn     -0.64
Positive Logits
hoax       1.14
sters      0.95
ulence     0.84
ishly      0.84
spoof      0.82
mong       0.78
erella     0.76
ument      0.76
prank      0.75
gee        0.74
Activations Density 0.009%