INDEX
Explanations
action words associated with cooking and vehicle control
Negative Logits
Lawson         -0.85
Malone         -0.74
Atlas          -0.70
Démographie    -0.69
Goldberg       -0.69
y              -0.69
Valdez         -0.68
Max            -0.67
Lawton         -0.67
pezi           -0.67
Positive Logits
Stir           1.56
Stir           1.55
stir           1.33
stir           1.27
Stirling       1.23
stirred        1.08
stirring       0.99
stirs          0.96
steering       0.94
steering       0.92
Activations Density 0.004%