INDEX
Explanations
references to hoaxes and pranks in the text
references to hoaxes and pranks
Negative Logits
Token         Logit
oyal          -0.73
avid          -0.71
definition    -0.69
arching       -0.69
obligated     -0.66
uv            -0.66
arity         -0.66
ibr           -0.66
affer         -0.65
foreseen      -0.65
Positive Logits
Token         Logit
hoax          1.27
spoof         0.91
gee           0.88
ulence        0.85
prank         0.84
sters         0.83
mong          0.79
ール          0.78
gey           0.76
etooth        0.74
Activation Density: 0.012%