INDEX
Explanations
mentions of "razor" and its variations
Negative Logits
iye      -0.17
esen     -0.15
ibur     -0.15
Howe     -0.15
enu      -0.15
ï¸       -0.14
Calder   -0.14
ιβ       -0.14
isd      -0.14
ìķł      -0.14
Positive Logits
blade    0.21
sharp    0.19
blades   0.19
razor    0.18
sharp    0.18
Razor    0.17
blade    0.17
raz      0.17
-edge    0.17
ilet     0.17
Activations Density 0.010%