INDEX
Explanations
references to playing games and leisure activities
Negative Logits
roz       -0.15
olist     -0.15
енÑĮ      -0.15
unh       -0.14
earer     -0.14
extr      -0.14
prov      -0.13
_interp   -0.13
Dispatch  -0.13
erer      -0.13
Positive Logits
Uno       0.25
tic       0.25
cards     0.24
Scr       0.22
dice      0.22
monopoly  0.22
opoly     0.22
board     0.21
card      0.21
Tic       0.21
Activations Density 0.188%
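
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below assumes that definition (an activation strictly above zero counts as firing). The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active; the dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up data: 3 active tokens out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 0.188% corresponds to roughly 19 active tokens per 10,000.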