INDEX
NEGATIVE LOGITS
লেখা        0.41
geno        0.38
STO         0.37
шем         0.37
Excluding   0.37
ප්රති        0.36
পাথ         0.36
ρα          0.35
reply       0.35
াণী         0.35
POSITIVE LOGITS
Whereas     0.72
WHEREAS     0.67
Whereas     0.65
whereas     0.64
whereas     0.57
WHERE       0.52
justifying  0.50
обосно      0.48
rationale   0.46
而          0.46
Activations Density 0.006%