INDEX
Explanations
unique identifiers or codes
NEGATIVE LOGITS
Diwedd        -0.45
,             -0.42
הערות         -0.42
Oldest        -0.41
ScopeManager  -0.41
exprim        -0.40
propOrder     -0.39
ícil          -0.39
<eos>         -0.38
edades        -0.38
POSITIVE LOGITS
DoubleQuotes  0.91
itſelf        0.77
Houſe         0.76
omock         0.69
ſelves        0.69
Reſ           0.69
Conſ          0.69
Perſ          0.68
ſelf          0.67
―――――         0.66
Activations Density: 0.091%