INDEX
Explanations
similes using the word "like."
similes and comparisons using the word "like."
Negative Logits
hiba        -0.90
ulty        -0.86
icators     -0.78
icator      -0.78
ourse       -0.76
icity       -0.76
ells        -0.75
idates      -0.74
urther      -0.73
Published   -0.73
Positive Logits
liest       1.38
lihood      1.30
lier        1.19
liness      0.88
unto        0.86
ours        0.77
minded      0.71
minded      0.71
hers        0.70
wildfire    0.67
Activations Density 0.049%