INDEX
Explanations
references to a specific individual named Bodhi
Negative Logits
Token      Logit
tery       -0.18
ees        -0.17
Rout       -0.16
ptrdiff    -0.16
.raises    -0.16
ee         -0.15
xcf        -0.15
iangle     -0.15
ROUT       -0.14
ador       -0.14
Positive Logits
Token      Logit
Bod        0.20
oland      0.18
acious     0.18
nant       0.18
ysize      0.17
ruž        0.17
amer       0.16
kins       0.16
рий        0.16
bod        0.15
Activations Density 0.004%