Explanations
numerical data and specific identifiers related to experimental or statistical results
Negative Logits
purpoſe     -1.13
pleaſure    -1.13
ſtre        -1.09
ſtate       -1.08
myſelf      -1.05
uſe         -1.03
Monfieur    -1.03
Eſ          -1.02
ویکیپدی     -1.00
itſelf      -1.00
Positive Logits
+#+         0.54
Re          0.54
ma          0.50
<tbody>     0.48
(blank)     0.48
&           0.47
time        0.47
Ac          0.47
pylene      0.46
astify      0.46
Activations Density 0.178%