INDEX
NEGATIVE LOGITS
Token     Logit
umbn      -0.90
iris      -0.63
pired     -0.62
oath      -0.60
hypocr    -0.60
»Ĵ        -0.57
Ĥ¬        -0.56
pione     -0.55
ibles     -0.55
redes     -0.54
POSITIVE LOGITS
Token     Logit
thanks    1.12
due       1.02
owing     1.02
thanks    0.95
compared  0.91
because   0.88
due       0.85
because   0.82
ecause    0.79
despite   0.77
Activations Density 0.350%