Explanations
references to the pronoun "you."
Negative Logits
rik      -0.18
окон     -0.15
ced      -0.15
stí      -0.15
fall     -0.15
mente    -0.15
.jp      -0.14
uyen     -0.14
illos    -0.14
ecure    -0.14
Positive Logits
za       0.17
dart     0.15
anning   0.15
vais     0.15
erval    0.15
нівер    0.14
ilm      0.14
uli      0.14
erdale   0.14
ocuk     0.13
Activations Density 0.190%