INDEX
Explanations
phrases that indicate generalizations or normative statements
words related to general statements or introductions, typically starting with "Generally" or "Initially"
Negative Logits
Token               Logit
ggles               -0.79
aily                -0.77
kamp                -0.74
kefeller            -0.71
ocaust              -0.66
addons              -0.65
umbn                -0.65
"},"                -0.63
natureconservancy   -0.63
uras                -0.62
Positive Logits
Token               Logit
speaking            1.16
adays               0.90
,                   0.88
there               0.87
we                  0.80
they                0.78
it                  0.77
,.                  0.75
Speaking            0.74
speaking            0.70
Activations Density 0.173%