INDEX
Explanations
terms related to physical attributes or actions
references to physical aspects or attributes
Negative Logits
ered     -0.75
adish    -0.74
bare     -0.72
gd       -0.72
quist    -0.71
ungle    -0.70
tower    -0.67
glers    -0.67
poon     -0.67
uden     -0.67
Positive Logits
manifestation     0.98
ity               0.95
sciences          0.88
impossibility     0.87
ized              0.87
manifestations    0.85
anguage           0.85
altercation       0.84
necessities       0.83
isation           0.80
Activations Density: 0.018%