INDEX
Explanations
phrases related to exclusivity or limitation
occurrences of the word "only."
Negative Logits
nowhere    -0.72
pipe       -0.66
rary       -0.66
phen       -0.66
idon       -0.66
insula     -0.65
zin        -0.64
hement     -0.63
cent       -0.61
ducers     -0.61
Positive Logits
marginally  1.05
lasted      1.00
ever        0.90
scratched   0.89
cares       0.87
lasts       0.86
spor        0.85
cared       0.81
existed     0.77
pretended   0.74
Activations Density 0.059%