INDEX
Explanations
phrases indicating similarities or comparisons between different subjects or objects
instances of comparisons between items or concepts
Negative Logits
Published     -0.66
mberg         -0.54
fre           -0.53
Guth          -0.53
Dispatch      -0.53
unes          -0.51
uff           -0.51
mans          -0.50
orts          -0.50
clamation     -0.50
Positive Logits
to            0.92
thereto       0.90
twins         0.87
lihood        0.87
ively         0.79
ities         0.78
unto          0.76
sized         0.74
ĸļ            0.71
worldly       0.70
Activations Density: 0.068%