INDEX
Explanations
repeated references to specific outcomes or features in a structured context
Negative Logits
Token          Logit
ioen           -0.55
RVA            -0.51
تقاوى          -0.50
ValueStyle     -0.50
pinulongan     -0.50
emb            -0.48
AddTagHelper   -0.48
žit            -0.48
Ross           -0.46
bain           -0.45
Positive Logits
Token          Logit
е              3.18
ее             1.38
Е              1.30
ҽ              1.29
е              1.13
є              1.13
e              0.99
ето            0.91
(blank)        0.84
ец             0.84
Activations Density 0.072%