Explanations
contractions and phrases indicating temporal references
Negative Logits
Token     Logit
лаÑĪ      -0.15
amarin    -0.14
emoc      -0.14
alles     -0.14
otec      -0.13
CSC       -0.13
orie      -0.13
ìĤ¬ëŀij    -0.13
ýn        -0.13
buggy     -0.13
Positive Logits
Token     Logit
Stanley   0.38
Stan      0.34
Stan      0.32
stan      0.31
Bonnie    0.27
-↵        0.23
Bon       0.22
Bon       0.21
bon       0.20
Guid      0.20
Activations Density 0.000%