INDEX
Explanations
specifically the word "your"
occurrences of the word "your" in various contexts
Negative Logits
apo          -0.95
forth        -0.82
Goes         -0.74
Cohn         -0.74
Originally   -0.71
ilts         -0.71
Hour         -0.69
ween         -0.68
Æ            -0.68
wik          -0.68
Positive Logits
own            1.36
favourite      1.19
favorite       1.08
selves         0.94
anmar          0.92
adversary      0.89
opponent       0.87
ocard          0.85
subscription   0.85
self           0.85
Activations Density: 0.105%