Explanations
references to specific individuals and their actions
Negative Logits
Token        Logit
myſelf       -0.92
―――――        -0.90
་་           -0.89
ſind         -0.88
ſch          -0.85
Jefus        -0.84
ſche         -0.84
Anſ          -0.83
faſt         -0.82
Efq          -0.82
Positive Logits
Token               Logit
↵↵                  0.53
<eos>               0.52
setVerticalGroup    0.51
所                  0.42
↵↵↵                 0.41
populaires          0.40
rungsseite          0.40
força               0.40
"                   0.39
<strong>            0.38
Activations Density: 0.064%