INDEX
Explanations
mentions of specific names, particularly "Edgar" and "Wen"
references to individuals named Edgar and Wen
Negative Logits
Token      Logit
akia       -0.91
atures     -0.91
tto        -0.88
cloth      -0.85
ton        -0.82
tons       -0.81
ships      -0.79
tics       -0.79
bee        -0.79
body       -0.78
Positive Logits
Token      Logit
ENC        0.84
ENTS       0.83
ENCY       0.75
sidx       0.73
avorite    0.71
akura      0.70
ENT        0.69
vernment   0.68
Edgar      0.68
tremend    0.67
Activations Density 0.017%