INDEX
Explanations
numerical data and figures within the text
Negative Logits
Token         Logit
icz           -0.15
lyn           -0.15
formik        -0.15
TEN           -0.14
71            -0.14
ofire         -0.14
ruz           -0.14
PullParser    -0.14
eg            -0.14
Ten           -0.14
Positive Logits
Token         Logit
980           0.20
440           0.16
20            0.16
960           0.16
19            0.16
pur           0.16
420           0.16
Ĥ             0.16
460           0.16
580           0.15
Activations Density: 0.187%