INDEX
Explanations
commonalities or shared characteristics between different entities
phrases that indicate shared characteristics or similarities
Negative Logits
Token            Logit
icer             -0.70
hement           -0.66
renheit          -0.65
Accountability   -0.65
veland           -0.65
ctors            -0.65
ighth            -0.64
gur              -0.63
zona             -0.62
bable            -0.61
Positive Logits
Token            Logit
alities          0.97
similarities     0.83
resemb           0.83
resemblance      0.78
worldly          0.70
distinguishing   0.70
twins            0.68
resembling       0.67
resemble         0.67
DragonMagazine   0.66
Activations Density 0.065%