INDEX
Explanations
instances of identical or nearly identical things in different contexts
instances of the word "identical" and related concepts of sameness
Negative Logits
raq       -0.78
stra      -0.78
erest     -0.75
âĵĺ       -0.71
sta       -0.71
================================================================  -0.70
HI        -0.70
bara      -0.69
rhet      -0.68
uay       -0.68
Positive Logits
twins      1.20
twin       0.90
icut       0.84
identical  0.80
lihood     0.78
sized      0.74
pairs      0.72
etrical    0.70
digits     0.70
copies     0.69
Activations Density 0.031%