Explanations
words that reference people and their attributes or achievements
Negative Logits
zew      -0.17
@nate    -0.15
ideo     -0.15
raid     -0.15
TeV      -0.14
.nano    -0.14
duino    -0.14
esi      -0.14
ju       -0.14
uzz      -0.14
Positive Logits
erd      0.16
äºķ      0.16
Jean     0.15
esper    0.15
den      0.15
cano     0.15
ding     0.15
è        0.15
cle      0.14
Cro      0.14
Activations Density: 0.057%