Explanations
percentages expressed as words, specifically focusing on 'per cent'
percentages and numerical statistics in the text
Negative Logits
Token       Logit
enthusi     -0.69
kindly      -0.64
ModLoader   -0.64
senal       -0.62
nep         -0.60
susp        -0.60
Raleigh     -0.58
anamo       -0.58
realism     -0.57
ARP         -0.56
Positive Logits
Token       Logit
cent        1.48
CENT        1.02
secut       0.99
missive     0.94
cent        0.90
percent     0.90
secution    0.90
iton        0.89
irteen      0.86
malink      0.85
Activations Density: 0.013%