INDEX
Explanations
mentions of readers and their experiences
Negative Logits
ander         -0.17
ument         -0.17
bedo          -0.14
vinces        -0.14
gor           -0.14
unts          -0.14
avax          -0.14
oken          -0.14
chet          -0.14
Reputation    -0.14
Positive Logits
hip           0.34
/view         0.28
/list         0.26
/read         0.19
/users        0.18
hips          0.18
HIP           0.17
her           0.17
/watch        0.17
-reader       0.17
Activations Density: 0.025%