INDEX
Explanations
references to hyper conditions or hyperactivity
Negative Logits
(blank)    -0.60
c          -0.54
<eos>      -0.53
-          -0.50
’          -0.49
ca         -0.49
nd         -0.49
&          -0.49
a          -0.48
w          -0.47
Positive Logits
UnusedPrivate    1.04
saites           1.02
imprimée         0.99
PreferredItem    0.89
Credentials      0.88
fabricate        0.86
Parcelize        0.84
للمعارف          0.83
ﷺ                0.83
&___             0.83
Activations Density 0.358%