INDEX
Explanations
mentions of the word 'bot'
references to bots and bot-related activities
Negative Logits
Barg          -0.65
Institution   -0.63
dress         -0.61
Trib          -0.60
mble          -0.60
Piercing      -0.59
Vie           -0.58
׾             -0.58
â̦â̦â̦â̦â̦â̦â̦â̦      -0.57
jerk          -0.56
Positive Logits
anical        1.74
anic          1.33
any           1.26
ania          1.16
cham          1.16
nets          1.07
ulin          1.01
herer         1.00
zeb           0.98
anie          0.97
Activations Density: 0.019%