INDEX
Explanations
phrases related to personal reflections and opinions
expressions of strong emotions and reactions
Negative Logits
catentry    -0.81
senal       -0.66
spam        -0.65
boosting    -0.63
ikarp       -0.63
cific       -0.62
maxwell     -0.61
OG          -0.61
carbohyd    -0.60
keyword     -0.60
Positive Logits
,—          1.04
;           0.90
.—          0.85
lest        0.80
enance      0.79
ankind      0.79
truths      0.78
enment      0.78
!           0.77
––          0.76
Activations Density 0.513%