Explanations
phrases indicating a contradiction or opposing viewpoint
affirmations or confirmations of statements
Negative Logits
dayName      -0.95
exting       -0.86
SourceFile   -0.82
bas          -0.81
pione        -0.76
stal         -0.74
ゼウス       -0.73
teasp        -0.72
ERT          -0.71
raq          -0.69
Positive Logits
some           1.05
there          0.99
sometimes      0.97
SOME           0.95
occasional     0.89
mistakes       0.89
occasionally   0.83
some           0.81
disagreements  0.81
we             0.81
Activations Density: 0.274%