INDEX
Explanations
phrases indicating completion or accomplishment
repeated phrases and expressions of action
Negative Logits
Torment      -0.86
Detail       -0.75
Arri         -0.74
Objects      -0.71
Violent      -0.69
Finished     -0.67
Mour         -0.67
Persons      -0.66
Survivors    -0.66
Sett         -0.65
Positive Logits
experiment   0.79
homework     0.77
grunt        0.72
pless        0.70
bidding      0.70
unia         0.68
diligence    0.67
pez          0.67
anecd        0.66
cially       0.66
Activations Density 0.367%