Explanations
phrases related to personal history and significant life events
Negative Logits
redo     -0.18
rys      -0.17
riz      -0.17
rello    -0.17
Rash     -0.16
edar     -0.16
rect     -0.15
рай      -0.15
+xml     -0.15
rect     -0.15
Positive Logits
Robert   1.09
robert   1.02
Bob      0.99
Robert   0.98
Rob      0.97
Rob      0.94
Роб      0.93
Bob      0.89
Roberts  0.88
bob      0.88
Activations Density 0.064%