INDEX
Explanations
references to cars and vehicles
"car" or "cars"
cars and vehicles
Negative Logits
onders      -0.55
Hinton      -0.49
Desmond     -0.49
unate       -0.49
Griffiths   -0.49
delights    -0.48
roppo       -0.48
concise     -0.48
>`;         -0.48
Delight     -0.47
Positive Logits
Car         0.75
Vehicle     0.75
Cars        0.73
Boat        0.71
Automobile  0.70
cars        0.70
Cars        0.68
Boats       0.68
Airplane    0.68
cars        0.67
Activations Density 0.105%