INDEX
Explanations
occurrences of the word "talk" and its variations
Negative Logits
Token     Logit
ment      -0.18
arily     -0.17
ional     -0.17
orer      -0.16
ijke      -0.15
etroit    -0.15
_msgs     -0.15
unk       -0.15
MENT      -0.15
ory       -0.14
Positive Logits
Token     Logit
ative     0.28
SPORT     0.21
ATIVE     0.20
-talk     0.18
bubble    0.18
çŃĴ       0.17
870       0.17
about     0.17
-shop     0.17
shop      0.16
Activations Density 0.032%
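
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number is commonly computed, assuming it means the fraction of sampled token positions on which the feature has a nonzero activation. The page does not state its definition, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 token positions, of which 3 fire.
sample = [0.0] * 10_000
sample[5] = 1.2
sample[4_200] = 0.4
sample[9_999] = 2.7

density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.032% would correspond to the feature firing on roughly 3 of every 10,000 sampled tokens.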