Explanations
pronouns and verbs related to personal interactions and conversations
Negative Logits
Pastebin      -0.67
town          -0.67
milo          -0.65
ĸļ            -0.64
Coalition     -0.62
Sutherland    -0.61
Census        -0.61
Said          -0.57
Houses        -0.56
congress      -0.56
Positive Logits
irresist          0.91
closer            0.88
stumble           0.77
reconsider        0.75
susceptible       0.75
awa               0.73
immune            0.72
oser              0.72
psychologically   0.72
rethink           0.72
Activations Density: 1.173%