INDEX
Explanations
verbal phrases indicating information or updates being shared
verbs indicating belief, announcement, or information disclosure
Negative Logits
isode    -0.71
vulner   -0.63
apeake   -0.61
quin     -0.60
Exit     -0.59
stad     -0.58
vette    -0.58
lua      -0.58
ston     -0.57
ohan     -0.57
Positive Logits
that           0.93
that           0.83
by             0.76
advisable      0.67
unanimously    0.64
amongst        0.62
BY             0.61
uality         0.59
THAT           0.58
incidentally   0.58
Activations Density 0.057%
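
The density figure above can be read as the share of tokens on which the feature fires at all. The sketch below assumes that definition (fraction of tokens with activation above zero), which this page does not state; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)

# Format as a percentage, the way the dashboard reports it.
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.057% would correspond to roughly 57 active tokens per 100,000.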