INDEX
Explanations
references to processes or conditions that indicate a need for clarification or correction in factual statements
Head Attr Weights
0:0.15
1:0.03
2:0.06
3:0.04
4:0.03
5:0.03
6:0.24
7:0.03
8:0.06
9:0.24
10:0.03
11:0.03
Negative Logits
Marqu      -3.90
athe       -3.77
Patty      -3.75
marqu      -3.53
Brewers    -3.52
tta        -3.52
Laur       -3.32
Hath       -3.31
ngth       -3.29
Eck        -3.29
Positive Logits
Sim           10.23
Sim           10.08
sim            8.56
sim            7.81
Sims           7.68
Simulation     7.52
simulator      7.42
SIM            7.24
simulation     7.17
SIM            7.02
Activations Density: 0.008%