Explanations
phrases or statements with strong assertiveness or emphasis
instances of the word "it" and variations indicating a statement or assertion
Negative Logits
Token        Logit
Mesh         -0.60
swing        -0.60
illas        -0.59
rame         -0.59
iggurat      -0.59
reet         -0.59
awareness    -0.58
atars        -0.57
imum         -0.56
ilated       -0.55
Positive Logits
Token          Logit
nor            1.22
nor            0.97
yet            0.89
anymore        0.83
unless         0.81
Nor            0.76
atures         0.71
merely         0.70
nevertheless   0.66
Ļ              0.65
Activations Density: 0.636%
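
The density figure can be read as a simple ratio. The sketch below assumes that density means the percentage of sampled tokens on which the feature's activation is above zero; the page itself does not define the term, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    # Count tokens where the feature is active (strictly above the threshold).
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Hypothetical sample: 1000 tokens, with the feature active on 6 of them.
sample = [0.0] * 1000
for position in (3, 120, 121, 407, 650, 999):
    sample[position] = 1.0

print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.636% would mean the feature fires on roughly 6 or 7 tokens out of every 1000 sampled.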