Explanations
sentences ending with punctuation marks
Negative Logits
reception    -0.77
eligibility  -0.75
restricted   -0.71
access       -0.70
submerged    -0.68
fet          -0.66
enrol        -0.66
withd        -0.65
credential   -0.65
recruiting   -0.65
Positive Logits
jpg        1.41
wikipedia  1.32
png        1.31
fm         1.28
edu        1.23
gif        1.22
exe        1.18
txt        1.17
gov        1.15
wav        1.13
Activations Density: 0.191%