Explanations
statements made by individuals
statements attributed to individuals
Negative Logits
irrel         -0.73
poon          -0.70
aceae         -0.66
pmwiki        -0.65
ãĥİ           -0.64
SourceFile    -0.64
aph           -0.63
tyard         -0.63
bang          -0.63
pee           -0.63
Positive Logits
that          1.18
that          0.86
:             0.83
furthermore   0.80
adays         0.78
there         0.77
governments   0.72
:"            0.72
:]            0.72
policymakers  0.70
Activations Density: 0.208%