INDEX
Explanations
conjunctions indicating contrast or exceptions
Negative Logits
Token           Logit
opsida          -0.73
/\.             -0.56
SPATH           -0.54
Parco           -0.53
ícil            -0.51
Coro            -0.50
CAG             -0.50
pewno           -0.50
Переваги        -0.49
intStringLen    -0.49
Positive Logits
Token           Logit
also            0.79
findpost        0.70
also            0.66
而且            0.66
también         0.63
OGND            0.61
juga            0.61
surate          0.60
아니라          0.60
també           0.59
Activations Density: 0.099%