INDEX
Explanations
the word "this."
references to specific moments or contexts in time
Negative Logits
sylv          -0.65
keeper        -0.64
kens          -0.61
ratulations   -0.61
STEM          -0.60
scape         -0.59
ghan          -0.59
gre           -0.58
keepers       -0.58
ilies         -0.57
Positive Logits
behest          1.21
expense         1.04
outset          1.03
intersections   0.96
helm            0.93
discretion      0.91
insistence      0.87
glance          0.86
concess         0.82
speeds          0.81
Activations Density: 0.142%