Explanations
Action verbs followed by "and"
Negative Logits
sesuatu    0.62
şeyler     0.60
这些人     0.59
cosas      0.58
something  0.57
qualcosa   0.57
شيء        0.54
ceva       0.54
이걸       0.53
coś        0.52
Positive Logits
and        0.70
or         0.63
এবং        0.63
અને        0.55
并         0.54
आणि        0.52
và         0.51
και        0.51
மற்றும்    0.51
_          0.50
Activations Density 0.013%