INDEX
Explanations
instances of the word "different" and its variations
Negative Logits
Token         Logit
different     -0.14
khác          -0.14
uchen         -0.14
gether        -0.14
róż           -0.14
different     -0.14
чих           -0.14
ycop          -0.13
зы            -0.13
diferentes    -0.13
Positive Logits
Token         Logit
iating        0.67
iator         0.60
iates         0.57
iators        0.52
ially         0.51
iable         0.50
ials          0.48
iations       0.45
iated         0.44
iability      0.41
Activations Density 0.091%