INDEX
Explanations
instances of comparison or contrasting ideas
Negative Logits
Token             Logit
NegativeButton    -0.75
Rosenberg         -0.67
Henn              -0.67
Man               -0.59
Rossi             -0.57
entgen            -0.57
ریه               -0.57
entu              -0.56
acate             -0.56
sphase            -0.56
Positive Logits
Token             Logit
Comparisons       1.59
comparisons       1.57
comparison        1.56
compares          1.42
Comparing         1.40
Comparison        1.39
compar            1.39
Comparisons       1.35
Compar            1.35
compared          1.34
Activations Density: 0.147%