INDEX
Explanations
phrases that indicate enumeration or lists
Negative Logits
intrusion            -0.86
guiActiveUnfocused   -0.62
ans                  -0.60
IPM                  -0.59
affair               -0.59
Dragonbound          -0.59
NP                   -0.57
existent             -0.57
matter               -0.57
deception            -0.57
Positive Logits
five                  0.80
seven                 0.78
excerpts              0.77
some                  0.76
reasons               0.76
three                 0.75
suggestions           0.74
six                   0.73
nine                  0.72
examples              0.72
Activations Density 0.082%