INDEX
Explanations
phrases related to a call to action or specific instructions
instances of the word "Show"
Negative Logits
hurd        -0.73
龍契士      -0.67
Dame        -0.61
Bere        -0.59
Pes         -0.58
Vengeance   -0.56
Cologne     -0.55
kj          -0.54
ヴ          -0.54
Raider      -0.53
Positive Logits
Thumbnails  0.86
isodes      0.70
iao         0.68
biz         0.67
Chart       0.66
hide        0.63
INGS        0.61
Cause       0.60
ing         0.60
tip         0.60
Activations Density 0.025%