Explanations
references to the car brand "Volkswagen."
references to Volkswagen vehicles
Negative Logits
Token      Logit
nces       -0.90
yip        -0.82
olulu      -0.80
xual       -0.79
mond       -0.76
thodox     -0.76
ttp        -0.75
EStream    -0.75
Uriel      -0.74
efeated    -0.73
Positive Logits
Token        Logit
Polo         1.00
actory       0.79
wagen        0.78
PA           0.76
VW           0.76
beetle       0.72
dealership   0.70
Golf         0.69
GH           0.69
Cola         0.68
Activations Density: 0.005%