INDEX
Explanations
mentions of chat services and platforms
Negative Logits
naires     -0.18
halb       -0.18
hare       -0.17
naire      -0.16
cheng      -0.16
ahl        -0.16
ses        -0.16
eous       -0.15
itzer      -0.15
chestra    -0.15
Positive Logits
anooga     0.29
ting       0.22
roulette   0.20
boxes      0.20
box        0.20
/chat      0.20
lain       0.20
reuse      0.18
bots       0.17
inch       0.17
Activations Density: 0.015%