Explanations
mentions of the word "Bang."
the specific phrase "Bang" and related references to it
Negative Logits
ensional            -0.72
cedes               -0.67
haps                -0.64
icient              -0.63
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.63
VICE                -0.63
eper                -0.61
externalToEVAOnly   -0.61
Cly                 -0.61
pent                -0.60
Positive Logits
alore               1.33
kok                 1.31
bang                1.11
bang                1.08
Bang                1.06
Bang                1.05
adesh               0.94
alter               0.91
la                  0.84
sam                 0.83
Activations Density: 0.030%