INDEX
Explanations
references to watching or monitoring activities
references to the organization "Watch."
Negative Logits
Token          Logit
routed         -0.63
phased         -0.62
urance         -0.62
depreciation   -0.62
alyses         -0.61
Sakuya         -0.61
cific          -0.61
effect         -0.61
suffix         -0.60
meaning        -0.59
Positive Logits
Token          Logit
Watch          3.97
watch          2.53
Watch          2.52
WATCH          2.39
watch          2.20
watches        1.73
Watching       1.62
WATCH          1.58
Wat            1.49
wat            1.35
Activations Density: 0.013%
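
As a rough illustration of what a density figure like 0.013% could mean, the sketch below assumes that density is the fraction of tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; none of them come from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source): a token counts as
    "active" when the feature's activation on it is strictly above threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 13 of which activate the feature.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.013%
```

Under this reading, a density of 0.013% corresponds to roughly 13 active tokens per 100,000, which would fit a feature that responds to a narrow family of tokens such as the "Watch" variants listed above.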