Explanations
occurrences of the word "Bro" and its variations
Negative Logits
Token   Logit
ulas    -0.18
alse    -0.17
rum     -0.16
ras     -0.16
ró      -0.15
rose    -0.15
308     -0.15
p       -0.15
pent    -0.15
ted     -0.15
Positive Logits
Token   Logit
oklyn   0.29
ccoli   0.26
oding   0.24
chure   0.23
okes    0.23
cade    0.22
oks     0.20
bro     0.20
iler    0.20
swer    0.19
Activations Density: 0.008%