Explanations
phrases related to storytelling or fictional events
Negative Logits
namely       -0.86
besides      -0.78
thood        -0.76
whenever     -0.76
according    -0.75
because      -0.74
suppose      -0.74
without      -0.74
with         -0.74
wherein      -0.73
Positive Logits
same            1.47
latter          1.44
slightest       1.42
entirety        1.39
aforementioned  1.36
remainder       1.34
entire          1.33
smallest        1.25
ses             1.25
widest          1.17
Activations Density: 4.021%