INDEX
Explanations
specific instances of significant actions, qualities, and attributes in various contexts
Negative Logits
wan      -0.15
riet     -0.15
Knife    -0.14
ands     -0.14
Knife    -0.14
Corner   -0.13
ochen    -0.13
scoped   -0.13
linear   -0.13
wart     -0.13
Positive Logits
olec           0.16
DataExchange   0.16
elah           0.15
.pretty        0.15
iks            0.14
Dragons        0.14
Holl           0.14
Bram           0.14
Голов          0.14
venta          0.14
Activations Density 0.015%