Explanations
statements or comments made by individuals
references to comments, statements, or claims made by various individuals
Negative Logits
contrace    -0.65
izont       -0.61
unpop       -0.61
inately     -0.60
cffffcc     -0.59
mating      -0.59
psc         -0.59
enegger     -0.59
enough      -0.58
isexual     -0.58
Positive Logits
undertaken  1.08
uttered     0.85
emanating   0.85
taken       0.84
exchanged   0.84
performed   0.83
made        0.79
done        0.78
issued      0.76
initiated   0.75
Activations Density 0.582%
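
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that "activation density" means the fraction of token positions on which the feature's activation is greater than zero. That definition, the function name `activation_density`, and the toy data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position, and a position is
    "active" when its activation is strictly greater than the threshold.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 6 of which are active.
toy = [0.0] * 994 + [0.7, 1.2, 0.3, 2.5, 0.1, 0.9]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.582% would mean the feature is active on roughly 6 out of every 1,000 token positions sampled.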