Negative Logits
so          0.61
berharap    0.61
all         0.60
guarantees  0.59
just        0.58
partner     0.58
lids        0.57
dominions   0.57
partners    0.57
incompar    0.56
Positive Logits
𝗔           0.60
ሔ           0.54
𝗥           0.54
𝗖           0.53
িলিয়া       0.52
𝗘           0.52
𝗨           0.52
𝗜           0.51
propan      0.50
PartialEq   0.50
Activations Density 0.000%