INDEX
Explanations
numerical values indicating measurements or quantities
Negative Logits
2     -0.92
4     -0.81
5     -0.80
7     -0.80
8     -0.79
3     -0.79
1     -0.77
9     -0.77
ly    -0.75
6     -0.75
Positive Logits
feroit        1.14
^(@)          1.06
wikipagina    1.05
pouvoit       1.03
auroit        1.03
mauva         1.03
avoient       1.01
ainfi         0.98
plufieurs     0.97
dévelo        0.97
Activations Density 0.642%