INDEX
Explanations
instances of criticism or assessment of performance
Negative Logits
ificio    -0.17
svp       -0.15
actually  -0.15
reform    -0.14
vincia    -0.14
ekli      -0.14
Tories    -0.14
reforms   -0.14
Reform    -0.14
acie      -0.14
Positive Logits
ric       0.16
plural    0.14
AMI       0.14
مال       0.14
oeff      0.14
ITO       0.14
cac       0.14
ONSE      0.14
lev       0.14
dressing  0.14
Activations Density 0.010%