Explanations
references to fried food
Negative Logits
_sdk      -0.17
beam      -0.17
Slave     -0.16
ivar      -0.15
ardi      -0.15
пал       -0.15
ounge     -0.15
>(        -0.14
bundles   -0.14
plates    -0.14
Positive Logits
rich      0.25
ricks     0.21
kin       0.20
lander    0.20
sam       0.19
emann     0.19
ريك       0.18
richt     0.18
helm      0.17
eric      0.17
Activations Density 0.010%