Explanations
phrases that denote relationships and memories associated with people or experiences
Head Attr Weights
0:0.02
1:0.01
2:0.05
3:0.08
4:0.16
5:0.03
6:0.02
7:0.35
8:0.03
9:0.03
10:0.08
11:0.06
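
The weights above can be ranked to see which head dominates. The following is a minimal sketch, assuming each `index:weight` pair is an attention head index and its attribution weight for this feature. The source does not say how the weights are normalized, and the listed values do not sum to exactly 1, so dividing by the listed total is an assumption made here for illustration. The names `head_weights`, `ranked`, and `top_share` are hypothetical.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.02, 1: 0.01, 2: 0.05, 3: 0.08,
    4: 0.16, 5: 0.03, 6: 0.02, 7: 0.35,
    8: 0.03, 9: 0.03, 10: 0.08, 11: 0.06,
}

# Rank heads from largest to smallest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]

# Assumption: normalize by the sum of the listed weights, since the
# page does not state what the weights are relative to.
total = sum(head_weights.values())
top_share = top_weight / total

print(f"top head: {top_head} (weight {top_weight:.2f}, share of listed total {top_share:.1%})")
print("top three heads:", [head for head, _ in ranked[:3]])
```

Under these assumptions, head 7 carries the largest weight, followed by head 4.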
Negative Logits
dissenting:-1.70
iosyncr:-1.56
jah:-1.52
annon:-1.50
endiary:-1.50
proponent:-1.49
olute:-1.48
inately:-1.48
ente:-1.48
itte:-1.47
Positive Logits
unfold:1.87
vividly:1.61
scenes:1.61
Remem:1.58
likeness:1.58
Haunted:1.57
anew:1.56
nightmares:1.54
memories:1.54
basics:1.54
Activations Density 0.002%