INDEX
Explanations
words related to physical attributes, specifically focusing on the concept of height
references to height
Negative Logits
Token       Value
vous        -0.91
eer         -0.85
vernment    -0.78
IRO         -0.77
eln         -0.77
ruption     -0.72
ktop        -0.72
trak        -0.72
Reloaded    -0.72
eers        -0.72
Positive Logits
Token       Value
taller      1.03
stature     1.01
tallest     0.98
tall        0.92
ness        0.88
enough      0.86
nesses      0.82
ened        0.81
tall        0.79
weeds       0.79
Activations Density: 0.010%