Explanations
dialogues in a legal or courtroom context
Negative Logits
etheless    -0.77
sprung      -0.69
pellets     -0.67
care        -0.65
outwe       -0.65
``          -0.65
drinkers    -0.65
aband       -0.65
fest        -0.64
eatures     -0.63
Positive Logits
Unable         1.21
Same           1.12
Finally        1.06
Lastly         1.05
Attempt        1.05
Beginning      1.04
Interesting    1.03
Beware         1.02
Whilst         1.02
Thousands      1.02
Activations Density: 1.715%