Explanations
strong emotional language and opinions
Negative Logits
Token        Logit
WHO          -0.59
scribe       -0.59
telling      -0.57
avid         -0.56
Yesterday    -0.55
Nom          -0.54
warning      -0.54
Pattern      -0.53
raid         -0.53
SAY          -0.52
Positive Logits
Token        Logit
rains        1.08
mattered     1.02
happens      0.99
comes        0.97
happened     0.90
occurs       0.89
chy          0.85
transpired   0.84
occurred     0.82
becomes      0.80
Activations Density 0.077%