Explanations
words related to claims and citations in discussions or reports
Negative Logits
natureconservancy   -0.76
welf                -0.71
goodbye             -0.70
toast               -0.69
animate             -0.68
Himself             -0.67
dearly              -0.67
afar                -0.66
eering              -0.63
♥                   -0.62
Positive Logits
detailing           1.12
titled              1.12
uggest              1.05
outlines            1.03
headlined           1.01
outlining           0.94
itled               0.93
entitled            0.90
dated               0.88
purported           0.87
Activations Density 0.251%