INDEX
Explanations
phrases that emphasize similarity or consistency
instances of the phrase "the same" or variations of it related to similarity
Negative Logits
amongst    -0.71
wana       -0.69
oa         -0.68
among      -0.67
Leilan     -0.58
gew        -0.58
abin       -0.57
mund       -0.56
Soul       -0.56
isma       -0.56
Positive Logits
same       2.73
same       2.45
Same       2.03
Same       1.87
opposite   1.70
exact      1.55
identical  1.25
inverse    1.09
reverse    1.05
ses        1.04
Activations Density: 0.301%
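
As a rough illustration of what an activation density figure like the one above can mean, the sketch below computes the fraction of token positions on which a feature fires. It assumes that density is defined as the share of tokens with activation above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from this page or any particular tool.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data, made up for illustration: 3 nonzero activations across 1000 tokens.
toy = [0.0] * 997 + [1.2, 0.4, 2.7]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.301% would correspond to the feature firing on roughly 3 of every 1000 tokens sampled.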