INDEX
Explanations
the word "themselves" within a sentence
references to ownership or self-referential expressions
Negative Logits
Yar                 -0.74
Dispatch            -0.73
ammy                -0.73
olid                -0.72
=-=-=-=-=-=-=-=-    -0.69
Syndicate           -0.68
oleon               -0.67
emis                -0.67
ropolis             -0.66
rop                 -0.66
Positive Logits
belonged            0.76
outwe               0.72
belong              0.70
extinguished        0.70
validated           0.70
wors                0.69
contained           0.69
proport             0.68
worshipped          0.68
extingu             0.68
Activations Density: 0.034%