Explanations
references to a specific word "Bro"
repeated mentions of the term "Bro" in various contexts
Negative Logits
Token       Logit
aneous      -0.70
Dull        -0.66
EDITION     -0.65
Nadu        -0.64
lessly      -0.63
orship      -0.63
lessness    -0.63
URA         -0.63
ORY         -0.61
Gemini      -0.61
Positive Logits
Token       Logit
ccoli       1.20
oks         1.07
gue         1.05
dy          1.01
keye        1.00
chet        0.98
thren       0.97
kens        0.97
etooth      0.95
thel        0.95
Activations Density: 0.016%