INDEX
Explanations
instances of exaggeration
instances of the word "exaggeration."
Negative Logits
itect  -0.74
worth  -0.74
rix  -0.72
rise  -0.71
ãĥĦ  -0.68
oir  -0.68
invent  -0.68
suscept  -0.65
sing  -0.65
rises  -0.65
Positive Logits
IMAGES  0.77
deen  0.69
================================================================  0.67
Rove  0.67
thumbnails  0.66
TPPStreamerBot  0.66
arthed  0.65
pedia  0.64
Era  0.64
URI  0.63
Activations Density 0.000%