INDEX
Explanations
dates in the format month, day, year
punctuation or symbols in the text
Negative Logits
Token     Logit
Si        -0.90
Takeru    -0.90
Sag       -0.89
Sai       -0.88
ティ      -0.87
SI        -0.86
Cas       -0.85
Sage      -0.85
sg        -0.84
Si        -0.84
Positive Logits
Token     Logit
uph       0.96
2017      0.93
bern      0.87
uron      0.87
2017      0.85
ber       0.83
aud       0.80
zan       0.79
oust      0.77
unk       0.75
Activations Density: 0.442%