Explanations
statements related to facts or factual information
mentions of "facts" and their importance in various contexts
Negative Logits
yss            -0.90
isoft          -0.86
osi            -0.83
antha          -0.83
ovo            -0.81
zik            -0.80
leased         -0.80
hod            -0.79
rint           -0.75
interstitial   -0.75
Positive Logits
heet           0.96
facts          0.88
facts          0.87
fulness        0.87
telling        0.86
inacc          0.79
pertaining     0.78
relating       0.72
surrounding    0.72
fact           0.71
Activations Density 0.016%