Explanations
references to displaying or presenting something
occurrences of the word "show" and its variations
Negative Logits
kson        -0.71
adesh       -0.64
ataka       -0.63
etary       -0.61
manif       -0.60
ades        -0.60
prime       -0.60
alez        -0.59
heading     -0.59
Blackwell   -0.58
Positive Logits
ered        1.17
biz         1.04
alter       1.03
case        0.86
cases       0.86
signs       0.84
runners     0.83
rooms       0.82
downs       0.80
boat        0.79
Activations Density: 0.061%