Explanations
references to the term "Bot" and related entities
Negative Logits
Token    Logit
o        -0.18
es       -0.18
oth      -0.17
hip      -0.15
hurst    -0.15
slide    -0.14
ianne    -0.14
t        -0.14
ث        -0.14
oice     -0.14
Positive Logits
Token    Logit
swana    0.38
anical   0.37
ched     0.28
tega     0.27
any      0.24
anic     0.23
.bot     0.20
CHED     0.20
anik     0.20
leneck   0.19
Activations Density: 0.008%