INDEX
Explanations
phrases indicating a comparison or contrast between different concepts or situations
repetitive mentions of the phrase "at least."
Negative Logits
FTWARE      -0.77
aceutical   -0.71
Fed         -0.64
alpha       -0.63
tools       -0.62
erness      -0.61
Pac         -0.61
OTUS        -0.59
itivity     -0.58
cro         -0.58
Positive Logits
least       1.53
onement     1.13
mosp        0.98
yp          0.94
times       0.92
ention      0.89
roph        0.86
hens        0.85
abase       0.83
variance    0.81
Activations Density 0.101%