INDEX
Explanations
statements expressing beliefs or opinions, especially optimism and confidence about future plans or developments
Negative Logits
Token         Logit
Himself       -0.67
upper         -0.65
RAW           -0.65
figure        -0.61
ãĥĺ           -0.60
Bundy         -0.59
lobe          -0.58
earthqu       -0.58
erupt         -0.58
me            -0.57
Positive Logits
Token         Logit
phas          0.88
fostering     0.77
onen          0.74
strongly      0.69
hement        0.69
passionately  0.68
rigorous      0.67
offering      0.66
orously       0.64
incent        0.64
Activations Density: 0.161%