Explanations
references to barber shops
instances of the word "hop" in various contexts related to hip-hop culture
Negative Logits
Downloadha  -0.80
MRI         -0.68
INE         -0.66
Reviewer    -0.64
UCT         -0.64
wolves      -0.63
Bath        -0.62
evil        -0.62
VID         -0.59
DIT         -0.59
Positive Logits
hop     1.62
efully  1.31
eful    1.19
hops    1.18
Hop     1.10
daq     0.86
alog    0.86
yright  0.84
hop     0.83
otle    0.82
Activations Density: 0.007%