INDEX
Explanations
proper names and names of places
various punctuation marks and transitions in textual contexts
Negative Logits
mund         -0.75
bow          -0.66
afety        -0.62
Woodward     -0.61
arah         -0.61
direction    -0.60
leen         -0.60
fundament    -0.60
gradient     -0.58
ãĤ§          -0.58
Positive Logits
respectively    1.44
anwhile         0.78
Interstitial    0.77
ibaba           0.75
avorite         0.74
aucuses         0.73
vying           0.73
among           0.72
latter          0.72
apiece          0.71
Activations Density: 0.498%