INDEX
Explanations
descriptors indicating the quality or significance of nouns
Negative Logits
olen         -0.21
acco         -0.18
iesel        -0.15
oli          -0.15
formance     -0.14
ivery        -0.14
rava         -0.14
ãĥ¬ãĥ³       -0.14
arine        -0.14
alore        -0.13
POSITIVE LOGITS
tendency          0.23
vested            0.22
knack             0.21
history           0.21
duty              0.20
obligation        0.20
vend              0.20
problem           0.19
responsibility    0.19
habit             0.19
Activations Density: 0.159%