INDEX
Explanations
nouns indicating a group or collection of entities
Negative Logits
igan       -0.18
ороз       -0.16
jvu        -0.16
rač        -0.15
shint      -0.15
-strokes   -0.15
さま       -0.14
ivas       -0.14
rim        -0.14
щ          -0.14
Positive Logits
ones       0.43
Ones       0.23
ones       0.20
own        0.17
specially  0.17
Moss       0.17
           0.16
oneself    0.15
spare      0.15
recently   0.15
Activations Density 0.140%