INDEX
Explanations
proper nouns and their correct usage
Negative Logits
IX          -0.17
hip         -0.16
elic        -0.15
olor        -0.15
ey          -0.15
arine       -0.15
istics      -0.14
/or         -0.14
hip         -0.14
OfType      -0.14
Positive Logits
fully        0.24
proper       0.19
ly           0.19
Proper       0.18
functioning  0.18
ment         0.17
nouns        0.17
mente        0.16
bred         0.16
-function    0.16
Activations Density: 0.023%