INDEX
Explanations
pronouns followed by words related to actions or events
the pronoun "we" and first-person expressions of collective experiences or actions
Negative Logits
Token            Logit
Leader           -0.74
Flavoring        -0.72
-+               -0.67
Saving           -0.66
ONSORED          -0.63
Jurassic         -0.62
Oral             -0.62
Responsibility   -0.62
odor             -0.61
Lever            -0.61
Positive Logits
Token            Logit
'll              1.14
've              1.08
'd               1.03
're              0.97
sych             0.94
forg             0.92
encount          0.89
learnt           0.84
eks              0.82
alian            0.81
Activations Density 0.261%