Explanations
references to value judgments and comparisons regarding quality and importance
Negative Logits
Token       Logit
spin        -0.15
orts        -0.15
ito         -0.14
аж          -0.14
spinning    -0.14
spin        -0.14
ux          -0.14
genus       -0.14
вÑģÑı        -0.14
argin       -0.13
Positive Logits
Token         Logit
substance     0.29
substantive   0.25
actual        0.24
-content      0.23
Substance     0.23
_content      0.22
actual        0.22
adro          0.21
content       0.21
itself        0.21
Activations Density: 0.399%