INDEX
Explanations
phrases related to official information or regulations
critical safety information or warnings
Negative Logits
Streamer   -0.76
¥          -0.73
VK         -0.71
Ħ          -0.68
¼          -0.67
¡          -0.66
EMOTE      -0.65
sys        -0.65
Allows     -0.65
[+]        -0.64
Positive Logits
arde       0.79
atown      0.75
nodd       0.73
neighb     0.72
canv       0.70
polic      0.70
reluct     0.70
conduc     0.67
ende       0.67
ourage     0.67
Activations Density: 0.000%