INDEX
Explanations
phrases related to making serious allegations or accusations
occurrences of the verb "to be" in various forms
Negative Logits
Fair      -0.63
Starts    -0.62
jri       -0.62
eger      -0.62
sonian    -0.62
ital      -0.62
pedia     -0.61
Reports   -0.61
Details   -0.61
inis      -0.60
Positive Logits
nt          1.02
actually    0.84
never       0.83
somehow     0.82
truly       0.81
destined    0.79
purposely   0.77
indeed      0.76
NetMessage  0.74
genuinely   0.74
Activations Density: 0.729%