Explanations
references to historical figures and etymology
Negative Logits
ordon        -0.17
Sentence     -0.15
athe         -0.15
at           -0.15
             -0.15
Sentence     -0.15
sentences    -0.15
Grammar      -0.15
public       -0.14
implied      -0.14
Positive Logits
_NAMES        0.18
yazılı        0.17
names         0.16
istrovství    0.16
REFERRED      0.16
ROLLER        0.16
\x93名        0.16
姓名          0.15
uží           0.15
-names        0.15
Activations Density 0.138%