INDEX
Explanations
words related to physical surfaces or areas
references to the surface of the Earth and its characteristics
Negative Logits
orns      -0.78
agne      -0.78
avering   -0.77
naire     -0.77
agna      -0.77
eros      -0.76
rams      -0.75
orum      -0.75
atory     -0.73
iane      -0.73
POSITIVE LOGITS
tenance      0.88
FACE         0.86
surface      0.77
facing       0.75
layer        0.73
combatants   0.72
mount        0.72
diam         0.70
llular       0.70
surfaces     0.69
Activations Density: 0.026%