Explanations
specific time-related or numeric information
Negative Logits
pre         -0.16
bande       -0.14
four        -0.14
ellido      -0.14
¤ëĭ¤        -0.14
chein       -0.14
mist        -0.14
three       -0.14
keh         -0.14
thousand    -0.13
Positive Logits
60          0.33
50          0.32
80          0.31
40          0.31
70          0.31
45          0.30
55          0.29
54          0.29
48          0.29
44          0.29
Activations Density: 0.203%
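
The page does not define how the density figure is computed. As a minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, it could be calculated as below. The function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active positions out of 1000 token positions.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.200%"
```

Under this reading, a density of 0.203% would mean the feature fires on roughly 2 of every 1000 sampled tokens, which fits a feature that responds mainly to a narrow class of numeric tokens.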