Explanations
periods or full stops in the text
Negative Logits
alyses       -0.76
eworld       -0.73
zsche        -0.69
opter        -0.69
onyms        -0.69
oven         -0.66
millennia    -0.66
ankind       -0.66
sworth       -0.64
spac         -0.64
Positive Logits
icio          0.88
Malley        0.87
Bernard       0.83
Bruce         0.82
Cu            0.81
Andrew        0.81
Gavin         0.80
Gerald        0.78
Scott         0.78
Maurice       0.78
Activations Density: 0.026%