INDEX
Explanations
references to corporate entities or organizational names
Negative Logits
abs      -0.18
orio     -0.18
Abs      -0.17
weis     -0.16
Abs      -0.15
-abs     -0.15
ARI      -0.15
oor      -0.15
bib      -0.15
pton     -0.15
Positive Logits
793      0.21
illos    0.17
itar     0.16
375      0.15
near     0.15
Vin      0.15
esch     0.15
325      0.15
792      0.14
spell    0.14
Activations Density: 0.004%