INDEX
Explanations
mentions of models of cars
Negative Logits
Token    Logit
ovie     -0.23
ensch    -0.22
undo     -0.21
aster    -0.19
apper    -0.19
ensen    -0.18
akeup    -0.18
atrix    -0.17
áy       -0.17
ama      -0.17
Positive Logits
Token    Logit
akk      0.16
ab       0.16
arse     0.15
olet     0.15
oli      0.15
asp      0.15
aban     0.14
RACT     0.14
acc      0.14
AMIL     0.14
Activations Density: 0.056%