INDEX
Explanations
words related to explosions, disruptions, or intense actions
words related to burial and death
Negative Logits
idious        -0.80
Citation      -0.59
forward       -0.58
intervening   -0.57
Benedict      -0.56
Rooney        -0.55
CoC           -0.55
Mara          -0.53
Wilkinson     -0.53
orno          -0.52
Positive Logits
ger           1.04
cers          1.04
ged           0.96
ges           0.93
sts           0.92
cer           0.87
ts            0.86
ctions        0.84
gers          0.84
geon          0.83
Activations Density: 0.100%