INDEX
Explanations
phrases indicating statements or arguments
statements and claims made by individuals
Negative Logits
estern      -0.82
odcast      -0.79
ocl         -0.75
enum        -0.72
uminati     -0.69
iership     -0.69
peg         -0.69
transfer    -0.69
iffe        -0.68
avorite     -0.68
Positive Logits
"[          0.86
it          0.84
they        0.80
instead     0.78
"'          0.78
therein     0.75
"...        0.73
that        0.72
"(          0.72
'[          0.71
Activations Density 0.085%