Explanations
references to car brands and some related terms
Negative Logits
Token      Value
Cro        -0.88
Osw        -0.86
nian       -0.85
yarn       -0.80
wu         -0.80
aina       -0.80
Meow       -0.79
Cro        -0.79
oire       -0.77
Witches    -0.73
Positive Logits
Token         Value
automotive    2.07
automakers    2.03
driver        1.96
drivers       1.95
Drivers       1.93
automobile    1.87
Driver        1.87
cars          1.87
Volvo         1.85
Driver        1.83
Activations Density 0.855%
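
The page does not define the density figure. A common reading, assumed here, is the share of sampled token positions on which the feature is active. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold
    (0.0 by default). The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations for 8 token positions; only one position is active.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 12.500%
```

Under this reading, a density of 0.855% would mean the feature is active on roughly 1 in 117 sampled token positions.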