INDEX
Explanations
occurrences of a specific character or symbol representation
Negative Logits
‘             -0.27
’             -0.24
‘             -0.16
’t            -0.15
’ve           -0.15
’m            -0.15
’ll           -0.15
,’’           -0.15
'_            -0.15
’den          -0.15
Positive Logits
sf            0.25
programmes    0.18
–             0.17
.sf           0.17
sf            0.17
vt            0.16
SF            0.16
humour        0.16
SF            0.15
_sf           0.15
Activations Density 0.003%