INDEX
Explanations
instances of uncertainty or speculation
references to reflective or analytical thoughts regarding societal issues
Negative Logits
srfAttach      -0.73
PLUS           -0.73
EMA            -0.72
Marg           -0.67
Atl            -0.65
IRT            -0.64
ENE            -0.63
CHAT           -0.63
MT             -0.63
Rats           -0.62
Positive Logits
technically    1.08
ostensibly     0.97
initially      0.80
admittedly     0.79
theoretically  0.78
undeniably     0.74
somewhat       0.73
undoubtedly    0.69
occasional     0.68
pid            0.67
Activations Density 0.407%