Explanations
instances of the word "watch" and its variations
Negative Logits
Abp        -0.83
للمعارف    -0.81
Damian     -0.73
])),       -0.68
Damien     -0.68
Clough     -0.68
◆◆         -0.67
trip       -0.67
cycline    -0.66
coln       -0.66
Positive Logits
watch      1.75
WATCH      1.68
Watch      1.67
watches    1.59
Watches    1.55
watches    1.52
watch      1.51
Watch      1.51
Watched    1.50
WATCH      1.48
Activations Density: 0.051%