Explanations
negative forms of contractions
Negative Logits
selected        -0.56
aya             -0.55
do              -0.54
DO              -0.54
O               -0.54
K               -0.53
EntryPoint      -0.52
group           -0.51
type            -0.50
DO              -0.50
Positive Logits
wasnt           0.97
twas            0.89
'],             0.82
useAppContext   0.81
isnt            0.80
t               0.80
Зноскі          0.79
theless         0.78
Мексичка        0.77
`,              0.77
Activations Density 0.080%
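
As a rough illustration of what an activation density figure like the one above measures, the sketch below computes the share of token positions on which a feature fires. The function name `activation_density`, the `threshold` parameter, and the sample activations are all hypothetical; the source does not show how the dashboard computes the value, so the definition used here (fraction of positions whose activation exceeds a threshold) is an assumption.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition, not confirmed by the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 8 of which fire.
sample = [0.0] * 9992 + [1.3, 0.4, 2.1, 0.7, 0.9, 3.2, 0.2, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.080%
```

With this made-up sample, 8 active positions out of 10,000 give a density of 0.080%, which is the same order of sparsity as the figure reported above.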