INDEX
Explanations
the word "talk" and related words
Negative Logits
emale       -0.76
boa         -0.69
arte        -0.68
iverpool    -0.65
uilt        -0.63
hews        -0.62
ritional    -0.61
rypt        -0.60
orius       -0.60
eele        -0.59
Positive Logits
about       0.99
aloud       0.87
louder      0.84
about       0.82
ABOUT       0.82
loudly      0.82
smack       0.80
frankly     0.77
ebus        0.75
ative       0.72
Activations Density 0.041%