INDEX
Explanations
capitalized letters at the beginning of sentences
the letter "A" or instances that contain the letter "A"
Negative Logits
Oswald          -0.64
anism           -0.62
proceedings     -0.61
olicy           -0.61
},{"            -0.61
Orient          -0.60
Abortion        -0.59
Adin            -0.58
pees            -0.57
Finish          -0.57
Positive Logits
cknowled        1.50
cknow           1.36
HAHAHAHA        1.18
HAHA            1.13
HHHH            1.11
ctors           1.04
chieve          1.02
verages         1.00
lot             1.00
lyss            1.00
Activations Density 0.174%
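
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (here: above-threshold) feature activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 10,000 tokens, 17 of which activate the feature.
sample = [0.0] * 9983 + [1.2] * 17
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.174% would mean the feature is active on roughly 17 of every 10,000 tokens.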