INDEX
Explanations
phrases related to gambling and casino activities
Negative Logits
Ò
-0.18
dew
-0.17
ch
-0.16
ary
-0.15
орая
-0.15
zh
-0.15
Fach
-0.15
mach
-0.15
asm
-0.14
ach
-0.14
Positive Logits
đ
0.18
uš
0.18
rž
0.18
odom
0.17
lač
0.17
rl
0.16
jedn
0.16
krv
0.16
eč
0.16
ć
0.15
Activations Density 0.015%