INDEX
Explanations
sentences starting with the word "According"
the phrase "According to" and related references
Negative Logits
helicop     -0.69
DOWN        -0.65
krit        -0.61
notor       -0.61
mathemat    -0.60
gobl        -0.59
godd        -0.59
eleph       -0.58
submar      -0.58
aden        -0.56
Positive Logits
ly           1.31
to           1.03
edly         0.95
lly          0.90
LY           0.83
itionally    0.76
ificantly    0.73
views        0.69
itely        0.68
To           0.68
Activations Density 0.038%