INDEX
Explanations
references to possessive form with an apostrophe followed by an 's' and a singular noun
the letter 's' in various contexts
Negative Logits
Token      Logit
Compare    -0.72
Lauder     -0.71
Compare    -0.71
lude       -0.68
Berger     -0.64
leans      -0.62
bench      -0.61
roup       -0.61
lehem      -0.61
inyl       -0.61
Positive Logits
Token          Logit
own            1.22
selves         1.09
ELF            1.09
inability      0.99
plight         0.99
grasp          0.96
reputation     0.96
shortcomings   0.95
predicament    0.95
self           0.94
Activations Density 0.227%