Explanations
occurrences and variations of the word "and"
Negative Logits
ⓘ       -0.15
çIJ      -0.14
plex    -0.14
bard    -0.14
485     -0.13
nas     -0.13
ignon   -0.13
ň       -0.13
bir     -0.13
èĢ      -0.13
Positive Logits
other    0.20
другими  0.19
altri    0.19
autres   0.18
others   0.18
its      0.18
outros   0.17
Its      0.17
других   0.17
Other    0.17
Activations Density 0.254%
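
Token strings in logit lists like the ones above are often exported in a byte-level BPE tokenizer's printable alphabet rather than as decoded text, which is why entries such as çIJ and èĢ look garbled. The sketch below assumes the GPT-2 style byte-to-character table; the source does not name the tokenizer, and the helper names `byte_decoder` and `decode_token` are hypothetical. It maps each displayed character back to its byte and decodes the result as UTF-8. Entries that are only a prefix of a multi-byte character cannot be recovered and come back as the replacement character.

```python
def byte_decoder():
    """Inverse of the GPT-2 style byte-to-printable-character table (assumed)."""
    # Bytes that are already printable are displayed as themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {chr(b): b for b in printable}
    # Every other byte is displayed as the next code point from U+0100 upward.
    offset = 0
    for b in range(256):
        if b not in printable:
            table[chr(256 + offset)] = b
            offset += 1
    return table


def decode_token(shown, table=None):
    """Turn a displayed token string back into text, as far as possible."""
    table = table or byte_decoder()
    # Characters outside the table (for example Cyrillic that is already
    # decoded) contribute their own UTF-8 bytes, so mixed strings still work.
    raw = b"".join(
        bytes([table[ch]]) if ch in table else ch.encode("utf-8")
        for ch in shown
    )
    # Incomplete byte sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("дÑĢÑĥгими"))  # другими
    print(decode_token("ÅĪ"))          # ň
    print(decode_token("çIJ"))         # incomplete sequence, shown as U+FFFD
```

A token that decodes to the replacement character is one fragment of a longer character (often CJK) that the tokenizer split across several tokens, so it has no standalone readable form.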