INDEX
Explanations
comparisons using the word "like"
similes or comparisons using the word "like."
Negative Logits
hiba     -0.86
inion    -0.82
ulty     -0.80
iets     -0.79
ieri     -0.74
inoa     -0.73
obook    -0.72
alf      -0.72
chin     -0.71
apist    -0.71
Positive Logits
lihood     1.50
liest      1.04
lier       0.98
wildfire   0.89
liness     0.81
ours       0.81
clock      0.79
minded     0.78
minded     0.78
hers       0.68
Activations Density 0.082%