INDEX
Explanations
occurrences of performance indicators or related metrics in the text
Negative Logits
tright    -0.17
p         -0.15
ür        -0.15
croft     -0.15
lette     -0.14
acin      -0.14
Äĩ        -0.14
c         -0.14
angan     -0.14
john      -0.14
Positive Logits
(s        0.24
(es       0.17
ellan     0.16
illin     0.16
ï¸ı       0.15
wards     0.15
æ´¥       0.15
еÑĢо      0.14
¼         0.14
(#)       0.14
Activations Density: 0.010%