INDEX
Negative Logits
palatable       0.58
detrimental     0.57
akin            0.54
substantial     0.54
pertinentes     0.52
imperative      0.52
formidable      0.52
worthwhile      0.51
prejudicial     0.51
constituting    0.51
Positive Logits
didn            1.24
did             1.14
gave            1.11
took            1.07
went            1.05
didnt           1.03
smiled          0.96
began           0.93
chose           0.92
showed          0.90
Activations Density 0.164%