INDEX
Explanations
instances of the word "different," indicating a focus on contrasts or variations
Negative Logits
Theſe        -0.80
actuels      -0.72
Jefus        -0.70
BoxFit       -0.69
kadang       -0.67
különböző    -0.65
zeiti        -0.65
houſe        -0.64
ſmall        -0.64
Shakspeare   -0.64
Positive Logits
PropertyGroup  0.58
EconPapers     0.51
相反           0.50
unlikely       0.50
rewritten      0.49
contraire      0.49
invoke         0.49
Gupta          0.48
相比           0.48
edit           0.48
Activations Density: 0.238%