INDEX
Explanations
the capitalized word "AN" followed by another capitalized word
the mention of specific acronyms or abbreviations
Negative Logits
shale      -0.76
duct       -0.72
ucket      -0.68
ibles      -0.65
endas      -0.65
emouth     -0.64
loft       -0.62
rero       -0.62
uration    -0.62
ynasty     -0.61
Positive Logits
AN         3.49
AN         1.57
AS         1.43
ANI        1.33
AU         1.33
SAN        1.30
AUT        1.27
MAN        1.25
An         1.24
ART        1.21
Activations Density 0.008%