INDEX
Explanations
references to positive or negative reception and success
the presence of the word "bo" in various contexts
Negative Logits
Token          Logit
dedication     -0.60
ware           -0.60
CLASSIFIED     -0.58
theless        -0.58
Millennium     -0.58
à¨             -0.56
nexus          -0.56
neutrality     -0.55
Sons           -0.55
specificity    -0.55
Positive Logits
Token          Logit
gey            1.30
oby            1.25
ogie           1.19
oms            1.15
quet           1.05
zos            1.02
omer           1.01
zo             1.00
ogi            0.99
ister          0.99
Activations Density: 0.023%