INDEX
Explanations
prepositions and auxiliary verbs related to duration and existence
Negative Logits
'},
-1.17
")){
-1.12
"):
-1.08
"},
-1.07
^(@)
-1.00
".
-1.00
`{.
-0.99
ſelves
-0.98
?>/
-0.96
'),
-0.96
Positive Logits
.
1.44
,
1.28
;
1.27
!
1.17
?
0.98
:
0.87
)
0.77
。
0.76
(
0.75
!!
0.75
Activations Density 0.639%