INDEX
Explanations
surface-related terms or descriptions
references to "surface" and its various contexts
Negative Logits
enne        -0.84
eros        -0.83
rams        -0.81
orn         -0.73
avering     -0.73
orns        -0.71
umar        -0.71
anders      -0.70
endo        -0.70
naire       -0.70
Positive Logits
FACE        0.84
tenance     0.81
facing      0.81
layer       0.79
combatants  0.77
mount       0.76
coating     0.75
area        0.72
waters      0.70
surface     0.70
Activations Density: 0.046%