INDEX
Explanations
references to a specific individual or name
Negative Logits
ship        -0.15
orean       -0.14
ALA         -0.14
ogonal      -0.14
ucket       -0.14
uro         -0.14
roj         -0.14
ear         -0.14
ID          -0.13
na          -0.13
Positive Logits
mes         0.16
isse        0.16
odes        0.16
uda         0.16
šov         0.16
hit         0.15
MES         0.15
dense       0.15
.Binding    0.14
pend        0.14
Activations Density: 0.019%