INDEX
Explanations
references to specific individuals and their roles or actions
Negative Logits
Token     Logit
ldb       -0.15
uchs      -0.15
esser     -0.15
oref      -0.14
ilib      -0.14
usto      -0.14
grav      -0.14
ucch      -0.14
ucas      -0.13
inger     -0.13
Positive Logits
Token      Logit
seem       0.26
certainly  0.25
seems      0.22
seemed     0.21
may        0.18
Certainly  0.17
appear     0.17
clearly    0.16
recently   0.16
appears    0.16
Activations Density: 0.311%