Explanations
key phrases that repeatedly mention the word "fact" and emphasize the stated information
phrases that refer to established or important facts
Negative Logits
Token      Logit
avorite    -0.96
livest     -0.72
awa        -0.69
asca       -0.67
itsch      -0.66
airs       -0.66
artney     -0.66
ESE        -0.66
Skies      -0.63
lungs      -0.63
Positive Logits
Token      Logit
ually      1.11
ional      1.09
orial      1.04
uality     1.03
itious     0.88
oids       0.86
uate       0.84
uation     0.79
finding    0.78
uel        0.77
Activations Density 0.020%
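
One way to read a density figure like this, sketched under an assumption: the page does not define "density", so the code below takes it to mean the fraction of sampled tokens on which the feature has a nonzero activation. The function name `activation_density` and the sample data are hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2 active tokens out of 10,000 sampled tokens.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.020%
```

Under that reading, a density of 0.020% would correspond to the feature firing on roughly 2 of every 10,000 tokens.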