Explanations
entity followed by description
NEGATIVE LOGITS
Mannes      0.44
kii         0.43
щему        0.42
kien        0.41
Abram       0.40
◜           0.40
Яро         0.40
VII         0.39
AISI        0.39
郅          0.38
POSITIVE LOGITS
inquired    0.37
regained    0.35
accountable 0.35
latt        0.35
répond      0.35
ND          0.34
号          0.34
able        0.34
النوم       0.34
original    0.33
Activations Density: 0.002%