INDEX
Explanations
references to individuals and personal connections
Negative Logits
thrown      -0.17
wondered    -0.17
blown       -0.16
reminded    -0.15
conceded    -0.15
drawn       -0.15
lename      -0.15
wins        -0.15
ensch       -0.15
rů          -0.14
Positive Logits
experienced  0.24
æĽ¾          0.23
lived        0.23
Did          0.23
Did          0.22
saw          0.22
did          0.22
played       0.21
Played       0.21
.did         0.21
Activations Density: 0.503%