Explanations
instances of company names or proper nouns
commonly used phrases and conjunctions that indicate lists or examples
Negative Logits
onym    -0.63
bucks   -0.62
irie    -0.62
hei     -0.61
ide     -0.60
sein    -0.60
sburg   -0.60
ths     -0.59
ei      -0.58
isse    -0.58
Positive Logits
which     1.37
whose     1.32
whose     1.23
which     1.19
where     1.03
wherein   1.02
whereby   0.98
who       0.95
although  0.91
aka       0.91
Activations Density: 0.496%