INDEX
Explanations
references to specific car models and their specifications
Negative Logits
ikon      -0.17
533       -0.16
coach     -0.16
crim      -0.15
coup      -0.14
cha       -0.14
chsel     -0.14
infeld    -0.14
goose     -0.14
elow      -0.14
Positive Logits
Exporter  0.16
undles    0.15
erer      0.15
дов       0.15
onth      0.15
Barber    0.14
omba      0.14
ectl      0.14
loor      0.13
_truth    0.13
Activations Density 0.283%