INDEX
Explanations
phrases indicating a minimum or baseline condition
Negative Logits
مشين                -0.87
Efq                 -0.78
Ponta               -0.72
Theſe               -0.72
XLV                 -0.72
uſed                -0.71
Fik                 -0.70
toHaveBeenCalled    -0.69
Pala                -0.69
Jem                 -0.69
Positive Logits
least               1.00
jmniej              0.89
atleast             0.81
Least               0.78
almeno              0.75
least               0.73
LEAST               0.72
Least               0.69
testens             0.69
zumindest           0.68
Activations Density: 0.073%