INDEX
Explanations
words related to inherent characteristics or properties
concepts related to inherent traits or qualities
Negative Logits
kers        -0.95
pherd       -0.74
eday        -0.71
uku         -0.70
rooms       -0.69
fare        -0.68
Psychiat    -0.68
cow         -0.67
annis       -0.66
Soda        -0.66
Positive Logits
cumbers     0.96
ulner       0.87
ities       0.85
weaknesses  0.84
dangers     0.82
flaws       0.82
inherent    0.82
urities     0.82
idad        0.77
qualities   0.77
Activations Density: 0.019%