Explanations
references to television show seasons
Negative Logits
Token         Logit
GI            -0.16
agus          -0.16
oc            -0.16
_ING          -0.15
ac            -0.15
BeforeEach    -0.15
ема           -0.15
urette        -0.15
upe           -0.14
cem           -0.14
Positive Logits
Token         Logit
finale        0.22
premiere      0.22
Fin           0.22
fin           0.20
prem          0.19
Premiere      0.19
fin           0.18
FIN           0.17
Hodg          0.17
-fin          0.17
Activations Density: 0.016%