NEGATIVE LOGITS
Gardens       -1.25
nahilalakip   -1.20
experiments   -1.12
Experiments   -1.02
Experiments   -0.99
gardens       -0.99
experiments   -0.96
GARD          -0.93
betweenstory  -0.93
EXPERIMENTS   -0.91
POSITIVE LOGITS
y      0.69
s      0.50
↵↵     0.43
ی      0.42
about  0.41
,      0.39
.      0.39
at     0.39
(      0.37
k      0.36
Activations Density: 0.147%