Explanations
punctuation marks, specifically commas
Negative Logits
leſs      -1.01
eſt       -0.85
itſelf    -0.83
neſs      -0.82
Eſ        -0.79
faſt      -0.79
Houſe     -0.78
myſelf    -0.78
ſel       -0.77
Spon      -0.77
Positive Logits
,         2.86
,         1.83
),        1.79
،         1.68
،         1.66
),        1.57
,\        1.49
”,        1.49
%,        1.43
,         1.40
Activations Density: 0.072%