INDEX
Explanations
mentions of naval warships, particularly cruisers
references to cruise ships
Negative Logits
Token               Logit
Spur                -0.87
Cause               -0.72
quire               -0.67
brand               -0.65
animate             -0.64
Celeb               -0.61
guiActiveUnfocused  -0.61
Any                 -0.61
idding              -0.60
hend                -0.59
Positive Logits
Token               Logit
cru                 2.90
importantly         1.56
notably             1.18
interestingly       1.10
controvers          1.01
vo                  0.89
moreover            0.88
impro               0.86
etheless            0.84
resid               0.83
Activations Density: 0.014%