INDEX
Explanations
phrases or words related to wheels
occurrences of the word "wheel" in various contexts
Negative Logits
ropolitan   -0.83
uates       -0.79
nces        -0.73
uated       -0.72
sidx        -0.67
orescent    -0.67
ABE         -0.66
Kore        -0.66
stract      -0.64
idental     -0.64
Positive Logits
chairs      1.33
wright      1.17
chair       1.17
wheel       1.17
wash        0.94
house       0.93
bar         0.92
base        0.91
hub         0.87
horn        0.87
Activations Density 0.016%