INDEX
NEGATIVE LOGITS
Flavoring    -0.80
bleacher     -0.74
pmwiki       -0.69
zbollah      -0.68
dstg         -0.68
Netflix      -0.67
£ı           -0.65
lapt         -0.64
Merit        -0.63
Gaza         -0.62
POSITIVE LOGITS
latter       0.86
him          0.72
he           0.61
"],          0.58
>(           0.56
she          0.55
Frenchman    0.54
such         0.54
both         0.53
hers         0.52
Activations Density 1.136%