Explanations
comparisons or similarities
comparisons or examples introduced by the word "like."
Negative Logits
icter       -0.80
itiz        -0.79
ombat       -0.78
ifest       -0.78
enary       -0.73
elf         -0.73
idates      -0.68
reth        -0.68
erb         -0.67
ribution    -0.67
Positive Logits
lihood      1.07
ours        0.79
lier        0.75
minded      0.70
inher       0.68
preferring  0.67
liest       0.67
evidenced   0.66
grouping    0.62
ATK         0.62
Activations Density 0.081%