Explanations
phrases indicating personal perceptions or beliefs
present tense verbs and forms of "to be" indicating personal identity and existence
Negative Logits
eger      -0.68
pedia     -0.66
jri       -0.66
refres    -0.64
aign      -0.63
Holiday   -0.62
ibaba     -0.60
ysis      -0.60
downs     -0.59
utions    -0.59
Positive Logits
nt              1.17
indeed          0.99
actually        0.91
somehow         0.86
never           0.83
truly           0.79
destined        0.77
incapable       0.76
not             0.75
indispensable   0.75
Activations Density: 0.579%