INDEX
Explanations
phrases indicating a list or continuation of information
Negative Logits
mand         -0.63
atorium      -0.60
glas         -0.57
intrusion    -0.57
informing    -0.57
deposit      -0.57
autions      -0.55
informed     -0.55
respect      -0.55
itled        -0.55
Positive Logits
etc            1.62
etc            1.20
Lastly         1.11
Finally        1.00
respectively   0.97
anything       0.95
ect            0.91
whatever       0.88
These          0.86
Lastly         0.86
Activations Density 2.218%