INDEX
Explanations
facts or statements presented as truth
the phrase "the fact" and its variations
Negative Logits
arms            -0.77
Klux            -0.74
throats         -0.72
avorite         -0.71
Interstitial    -0.66
artney          -0.65
crow            -0.64
sung            -0.63
airs            -0.63
chambers        -0.62
Positive Logits
ional           1.06
ually           1.00
Fact            0.96
uality          0.92
fact            0.88
Fact            0.86
icity           0.86
orial           0.84
itious          0.84
undo            0.76
Activations Density: 0.028%