Explanations
instances of the word "won" indicating victories or successes
Negative Logits
eing      -0.18
že        -0.17
gone      -0.15
bones     -0.15
eval      -0.15
ev        -0.15
tin       -0.15
ÂĿ        -0.15
žil       -0.14
asl       -0.14
Positive Logits
't        0.41
’t        0.40
kish      0.22
;t        0.20
´t        0.19
not       0.19
ky        0.19
ä¸įäºĨ    0.19
DER       0.18
'T        0.18
Activations Density 0.010%