Explanations
mentions of duties, responsibilities, obligations, and tasks
references to rituals and ceremonies
Negative Logits
Worse        -0.63
essim        -0.63
disrupting   -0.63
moil         -0.63
disrupt      -0.62
blocking     -0.61
losers       -0.61
headlines    -0.60
disrupted    -0.59
derail       -0.59
Positive Logits
wonderful    0.87
honour       0.84
utmost       0.82
blessed      0.80
lovely       0.79
dignity      0.79
nour         0.78
honoured     0.78
entrusted    0.77
wonderfully  0.75
Activations Density: 1.370%