INDEX
Explanations
words related to attention or attraction towards a subject
instances of the word "interest" in various contexts
Negative Logits
seams    -0.72
apple    -0.67
abby     -0.66
é¾į      -0.64
aan      -0.61
女       -0.61
thur     -0.60
é¾       -0.59
llan     -0.59
prus     -0.59
Positive Logits
ocene       0.84
ATURE       0.81
enza        0.80
="#         0.74
topic       0.74
trolling    0.73
edIn        0.73
Groups      0.73
Interest    0.72
Rate        0.71
Activations Density: 0.021%