INDEX
Explanations
phrases indicating emphasis or importance
phrases emphasizing the concept of truth or reality
Negative Logits
itches      -0.67
iband       -0.64
hoe         -0.64
throp       -0.62
inges       -0.62
aturdays    -0.61
wills       -0.61
andise      -0.60
pled        -0.60
rites       -0.59
Positive Logits
cussion      0.91
not          0.85
NOT          0.85
olation      0.81
still        0.76
indeed       0.76
unclear      0.75
definitely   0.75
undoubtedly  0.75
ometric      0.74
Activations Density: 0.185%