Explanations
the word "crazy" with high activations
instances of the word "crazy."
Negative Logits
Reviewed    -0.93
fman        -0.85
tein        -0.81
agle        -0.80
arers       -0.80
ヘラ        -0.79
ribut       -0.78
lain        -0.77
apers       -0.76
••••        -0.75
Positive Logits
crazy       0.77
nuts        0.76
shit        0.73
spawn       0.72
beasts      0.69
astically   0.66
chic        0.65
antics      0.65
ishly       0.65
fringe      0.63
Activations Density 0.023%
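
The density figure above is not defined on this page. One common reading, assumed here, is the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means firing frequency over a sample of tokens.
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [3.1, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.020%
```

Under this reading, a density of 0.023% would mean the feature fires on roughly 23 of every 100,000 sampled tokens.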