INDEX
Explanations
occurrences of prepositions and relational phrases
Negative Logits
plings        -0.15
onta          -0.15
ilee          -0.15
uars          -0.14
kbd           -0.14
ông           -0.14
imers         -0.14
Tight         -0.14
TURE          -0.14
appearance    -0.14
Positive Logits
eda           0.15
ovel          0.14
poll          0.14
ousse         0.14
raction       0.14
cigars        0.13
ure           0.13
estre         0.13
APS           0.13
tact          0.13
Activations Density: 0.007%