INDEX
Explanations
mentions of vehicles or vehicle-related terms
references to vehicles and related terminology
Negative Logits
Token        Logit
generosity   -0.73
Humanity     -0.67
Sisters      -0.67
beauty       -0.66
harmless     -0.64
Flask        -0.62
fame         -0.61
Hawaiian     -0.60
prevailing   -0.60
Patriot      -0.60
Positive Logits
Token        Logit
icle         1.25
icular       1.20
icles        1.19
veh          1.14
rians        1.01
rolet        0.98
rian         0.91
stant        0.84
iqueness     0.83
ijah         0.83
Activations Density: 0.009%