Explanations
adjectives describing the strength or intensity of actions or states
references to governance or procedural issues
Negative Logits
behavi     -0.77
ulence     -0.74
tremend    -0.67
reality    -0.65
lihood     -0.63
ethics     -0.63
figure     -0.62
neigh      -0.62
NESS       -0.62
relent     -0.62
Positive Logits
pleted     1.15
Played     1.12
ired       1.12
iked       1.07
ried       1.07
arthed     1.05
asted      1.04
Used       1.03
Bought     1.02
Posted     1.01
Activations Density: 0.229%