INDEX
Explanations
phrases related to the importance of specific details or characteristics
phrases indicating significant emotional or social themes
Negative Logits
Token         Logit
igslist       -0.69
jri           -0.68
eln           -0.66
Specifically  -0.65
rin           -0.64
Ö¼            -0.64
Previously    -0.64
bis           -0.64
bish          -0.63
querque       -0.62
Positive Logits
Token         Logit
whichever     1.13
ALWAYS        0.86
whoever       0.85
whatever      0.85
whatever      0.84
regardless    0.83
whether       0.79
depending     0.78
invariably    0.76
Regardless    0.75
Activations Density 0.305%