INDEX
Explanations
proper nouns, especially names related to individuals and their titles
Negative Logits
ervas             -0.17
Apprec            -0.15
Rena              -0.15
fName             -0.14
ÑĥлÑİ             -0.14
subject           -0.14
ٳ                 -0.14
apprec            -0.14
acting            -0.13
/kernel           -0.13
Positive Logits
.synthetic        0.16
trì               0.16
.ke               0.15
.openConnection   0.15
maf               0.15
Ä±ÅŁÄ±            0.14
.CommandType      0.14
å²                0.14
ivec              0.14
mek               0.14
Activations Density: 0.028%