INDEX
Explanations
phrases related to attracting attention or investment
Negative Logits
Oops         -0.69
timestamp    -0.67
jab          -0.64
recount      -0.62
Printed      -0.58
format       -0.58
sleeve       -0.57
annex        -0.57
Abbey        -0.57
arat         -0.56
Positive Logits
attention    1.06
unwanted     0.84
entious      0.81
crowds       0.74
unwelcome    0.72
xual         0.72
kefeller     0.72
attracted    0.71
hordes       0.71
dinand       0.71
Activations Density 0.041%