INDEX
Explanations
terminology related to configuration and operational performance
Negative Logits
‘               -0.67
“               -0.65
Билгалдахарш    -0.56
eben            -0.55
’               -0.54
wasn            -0.52
Indeed          -0.52
indeed          -0.52
gridx           -0.52
big             -0.51
Positive Logits
pleaſure        1.08
ſche            0.96
fubject         0.93
purpoſe         0.93
houſe           0.93
auffi           0.89
neceſſ          0.88
Monfieur        0.88
ſtate           0.86
myſelf          0.86
Activations Density 0.411%