INDEX
Explanations
phrases related to comparing or contrasting different items or concepts
comparative phrases indicating examples or references to other entities or events
Negative Logits
ennes      -0.81
essing     -0.78
arbon      -0.75
ells       -0.74
atform     -0.73
elin       -0.73
inion      -0.71
enary      -0.69
inas       -0.69
enthusi    -0.68
Positive Logits
lihood     1.32
lier       0.96
liest      0.92
ours       0.91
hers       0.77
yours      0.75
minded     0.67
åĭ         0.66
theirs     0.65
pick       0.65
Activations Density 0.070%