INDEX
Explanations
phrases that reference the nature or properties of objects
Negative Logits
istrovství    -0.18
UFFIX         -0.15
оит           -0.14
レット        -0.14
Sheridan      -0.14
SSF           -0.14
olf           -0.14
ριο           -0.14
DITION        -0.13
γορ           -0.13
Positive Logits
kind       0.52
sort       0.46
kind       0.45
KIND       0.42
Kind       0.40
species    0.39
.kind      0.39
種         0.37
Kind       0.37
sort       0.37
Activations Density 0.042%