INDEX
Explanations
patterns of comparison and similarity among subjects
NEGATIVE LOGITS
Separate     -0.76
separate     -0.76
Separate     -0.73
separate     -0.69
separately   -0.69
terpisah     -0.65
eroon        -0.64
separado     -0.64
FetchType    -0.64
Individual   -0.63
POSITIVE LOGITS
similarities   0.82
common         0.82
similarity     0.82
same           0.72
similar        0.71
shared         0.71
Same           0.70
identical      0.67
共通           0.66
Same           0.66
Activations Density 0.484%
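
As a rough illustration of what a density figure like the one above may represent, the sketch below assumes it is the fraction of sampled token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the toy data are all assumptions made for illustration; the page itself does not say how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    where the feature fires at all. The source does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of them active.
toy_activations = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]

density = activation_density(toy_activations)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.484% would mean the feature is active on roughly 1 in 200 sampled tokens.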