INDEX
Explanations
phrases that compare entities, emphasizing the uniqueness or superiority of one over another
Negative Logits
ugas      -0.15
always    -0.15
exactly   -0.14
inci      -0.14
rouch     -0.14
ali       -0.14
usk       -0.14
ney       -0.14
Exactly   -0.14
ека       -0.13
Positive Logits
other     0.21
single    0.19
other     0.18
其他      0.17
SINGLE    0.16
others    0.16
single    0.16
others    0.16
ught      0.16
otras     0.15
Activations Density 0.032%