INDEX
Explanations
names of individuals mentioned within the text
Negative Logits
stal         -0.64
vation       -0.64
ension       -0.63
ory          -0.60
ization      -0.56
¦            -0.56
itri         -0.56
ruction      -0.56
stall        -0.55
strate       -0.55
Positive Logits
awatts       0.62
Jem          0.60
Reviewer     0.60
itsch        0.60
oro          0.60
wei          0.58
irk          0.58
vertisement  0.57
raft         0.56
hart         0.56
Activations Density: 5.055%