INDEX
Explanations
start tags indicating the beginning of segments or items in lists
Negative Logits
Token    Logit
'        -0.61
and      -0.60
         -0.60
"        -0.59
(        -0.59
on       -0.56
,        -0.56
         -0.52
to       -0.52
in       -0.52
Positive Logits
Token           Logit
متعلقه          1.01
houſe           0.86
Билгалдахарш    0.84
Anſ             0.83
iParam          0.79
purpoſe         0.79
iſt             0.78
NUMX            0.78
Houſe           0.78
esternos        0.77
Activations Density 0.172%