INDEX
Explanations
instances of the word "watching" and its variations
Negative Logits
dür         -0.43
idéia       -0.43
İş          -0.42
ähteet      -0.41
kaufs       -0.37
proef       -0.36
zult        -0.36
voordeel    -0.35
mourut      -0.35
price       -0.35
Positive Logits
watching    2.00
Watching    1.88
Watching    1.83
watching    1.83
watcher     1.41
watchers    1.41
watcher     1.26
watched     1.26
watched     1.25
WATCH       1.24
Activations Density: 0.004%