INDEX
Explanations
references to significant events in a game context
Negative Logits
онь          -0.17
Moy          -0.15
рован        -0.15
atego        -0.15
нии          -0.14
favorable    -0.14
лиц          -0.14
favor        -0.14
aney         -0.14
ĐT           -0.14
Positive Logits
Portuguese   0.20
ê            0.18
azer         0.16
Brazilian    0.16
ouver        0.16
ês           0.15
Porto        0.15
.bc          0.15
Portug       0.15
Portugal     0.15
Activations Density 0.162%