Explanations
words and phrases indicating comparisons or contrasting elements
Negative Logits
indeed    -0.08
obec      -0.07
exactly   -0.07
andon     -0.07
KeyId     -0.06
také      -0.06
op        -0.06
ilent     -0.06
Indeed    -0.06
abi       -0.06
Positive Logits
STILL      0.07
still      0.07
sometimes  0.07
azo        0.07
succ       0.06
sson       0.06
eshire     0.06
поба       0.06
enef       0.06
UILD       0.06
Activations Density: 0.049%