INDEX
Explanations
references to individuals and their roles or titles in a narrative
Negative Logits
Token     Logit
_Tis      -0.17
iglia     -0.16
ricks     -0.15
eer       -0.15
ÎĶε       -0.14
asure     -0.14
OTES      -0.14
Connor    -0.14
rush      -0.14
ayla      -0.14

Positive Logits
Token     Logit
Grant     0.18
Robert    0.17
Carl      0.16
Hod       0.15
int       0.15
Guy       0.14
Viv       0.14
Patch     0.14
ific      0.13
My        0.13

Activations Density 0.398%
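
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature has a nonzero activation; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the source:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of them with a nonzero activation.
sample = [0.0] * 995 + [0.7, 1.2, 0.3, 2.1, 0.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.398% would correspond to the feature firing on roughly 4 of every 1000 sampled tokens.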