Negative Logits
ginas         -0.07
Come          -0.06
ُو            -0.06
.key          -0.06
bou           -0.06
gracious      -0.06
Come          -0.06
альна         -0.06
döneminde     -0.06
ighborhood    -0.05
Positive Logits
stractions    0.07
###↵          0.07
(Config       0.06
_wp           0.06
(hist         0.06
act           0.06
erman         0.06
```↵          0.06
TN            0.06
]) ↵ ↵        0.06
Activations Density: 0.016%