Explanations
references to sports teams and their activities
Negative Logits
rencont     -0.17
pong        -0.16
rema        -0.15
ÑĢеж        -0.15
phia        -0.15
follando    -0.14
æ§          -0.14
bree        -0.14
Dice        -0.13
vede        -0.13
Positive Logits
fer         0.30
ser         0.26
Fer         0.23
oure        0.21
ollar       0.21
estar       0.21
né          0.20
fer         0.20
_ser        0.19
ser         0.19
Activations Density 0.003%